PHD Thesis Burckner Zeilinger
Dissertation submitted for the academic degree of Doctor of Technical Sciences, under the supervision of
Supported by the Fonds zur Förderung der wissenschaftlichen Forschung (Austrian Science Fund), project nos. S6502 and F1506
Contents

Introduction

Chapter 1
1.1 Unbestimmtheit vs. Unbekanntheit in a Quantum Experiment
1.2 Conceptual Inadequacy of the Shannon Information in a Quantum Measurement
1.2.1 An Operational Approach
1.2.2 An Axiomatic Approach
1.2.3 A Physical Approach
1.3

Chapter 2
2.1 A Qubit Carries One Bit
2.1.1 Complementary Propositions
2.1.2 Invariant Information in a Qubit
2.2 Two Qubits Carry Two Bits: Entanglement
2.2.1 Pairs of Complementary Propositions
2.2.2 Invariant Information in Two Qubits
2.3

Chapter 3
The Principle of Quantization of Information
The Number of Mutually Complementary Propositions
Malus' Law in Quantum Mechanics
The de Broglie Wavelength
Dynamics of Information
Linearity and Arbitrarily Fast Communication
Change of Information in Measurement: Reduction of the Wave Packet

B.1 Continuity of Information Implies Analyticity of Information
B.2 A General Transformation in the Space of Information

Conclusions
Preprint from Phys. Rev. Lett.
References
Summary

In any possible quantum experiment a finite number of different outcomes is possible, for example the individual spin outcomes "spin up" and "spin down". Before the experiment is performed, an observer knows only the specific probabilities of all possible individual outcomes. We define a new measure of information for an individual measurement. It is based on the fact that in an individual quantum measurement the only properties of the system that are defined before the measurement is performed are the specific probabilities for all possible individual outcomes. According to the Copenhagen interpretation of quantum mechanics, which was elaborated in particular by Niels Bohr, it makes no sense to speak of a property of a quantum system independently of the experimental arrangement in which this property manifests itself. The observer is in any case free to choose different experimental arrangements, which may even completely exclude one another, for example measurements of orthogonal components of spin. This quantum complementarity of variables occurs when the corresponding operators do not commute. One variable, for example one component of spin, can be precisely defined at the expense of maximal uncertainty about the other orthogonal components. We define the total information content of a quantum system as the sum of the information measures of the individual variables of a complete set of mutually exclusive (complementary) variables. The observer may decide to measure a different set of complementary variables and thereby gains knowledge about one or more variables at the expense of less knowledge about others.

In the case of spin measurements these could be the projections along rotated directions, for which the uncertainty in one component is reduced and the uncertainty in another component (or in several components) is correspondingly increased. Intuitively one expects that the total uncertainty or, equivalently, the total information contained in the system remains unchanged under such a transformation from one complete set of complementary variables to another. We show that the total information of a system, defined according to our new measure, has exactly this invariance property. We interpret the existence of this property of the total information as an indication that information is the most fundamental notion in quantum mechanics. In the first part of this thesis we show, based on the results of quantum theory, the validity of the invariance property of the total information and suggest ideas for the foundational principle of quantum mechanics. In the second part we argue for a new foundational principle of quantum mechanics, which assumes that the most elementary system is characterized by one bit of information. Likewise, a composite system consisting, for example, of two elementary systems represents two bits. Starting from this foundational principle we then derive some essential elements of the logical structure of quantum theory. The total information of a system (consisting of a finite number of bits) manifests itself only in certain measurements. Since a quantum system cannot carry more information than is contained in its bits, the randomness of the individual outcomes in the other (complementary) measurements is then a necessary consequence. This kind of randomness is irreducible, that is, it cannot be traced back to hidden properties of the system. Otherwise the elementary system would carry more information than one bit. The most natural function relating the probability for a specific outcome to occur to the laboratory parameters, consistent with the foundational principle that an elementary system carries only one bit of information, must be the sinusoidal dependence.

Entanglement results from the fact that the information of a composite many-particle system can be distributed over joint properties. For a two-particle system, for example, we obtain maximal entanglement when the two bits have been exhausted in specifying joint properties and no further possibility remains of encoding information in the individual particles.
Abstract
A new measure of information in quantum mechanics is proposed which takes into account that, for quantum systems, the only features known before an experiment is performed are the probabilities for various events to occur. The sum of the individual measures of information for mutually complementary observations is invariant under the choice of the particular set of complementary observations and conserved in time if there is no information exchange with an environment. This operational quantum information invariant results in k bits of information for a system consisting of k qubits. For a composite system, maximal entanglement results if the total information carried by the system is exhausted in specifying joint properties, with no individual qubit carrying any information on its own. We interpret our results as implying that information is the most fundamental notion in quantum mechanics. Based on this observation we suggest ideas for a foundational principle for quantum theory. It is proposed here that the foundational principle for quantum theory may be identified through the assumption that the most elementary system carries one bit of information only. Therefore an elementary system can only give a definite answer in one specific measurement. The irreducible randomness of individual outcomes in other measurements and quantum complementarity are then necessary consequences. The most natural function between the probabilities for outcomes to occur and the experimental parameters, consistent with the foundational principle proposed, is the well-known sinusoidal dependence.
Introduction
The ongoing debate about the interpretation of quantum mechanics, including the meaning of specific phenomena like the measurement problem, indicates that the foundations of quantum theory are not understood to the same degree as those of classical mechanics or special relativity. While the basic concepts of classical mechanics coincide well with our intuition, special relativity lies outside our immediate intuition. Yet this theory is based on the principle of relativity, which asserts that the laws of physics must be the same in all inertial systems, including the constancy of the speed of light. However, even though the theory itself is based on such simple and in part intuitively clear principles, it nevertheless predicts some surprising and even counter-intuitive consequences. The foundational principles for special relativity imply an invariance of the specific interval (eigenzeit) between two events with respect to all inertial frames of reference. Data on pure time intervals obtained with respect to two relatively moving inertial frames of reference will differ, and so will data on spatial distances. It is possible, however, to form a single expression from time intervals and space distances that has the same value with respect to all inertial frames of reference. If the time interval between two distant events is denoted by $\Delta t$ and their spatial distance from each other by $\Delta l$, a quantity $\Delta s$ can be defined whose square equals the square of the time interval minus the distance squared divided by the speed of light squared, $(\Delta s)^2 = (\Delta t)^2 - (\Delta l)^2/c^2$. This has the same value as $(\Delta t')^2 - (\Delta l')^2/c^2$, with $\Delta t'$ and $\Delta l'$ obtained in another inertial frame of reference. Quantum mechanics lacks such invariants and principles to this day.
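As a minimal numerical sketch of this invariance, the snippet below applies a standard Lorentz boost along one axis to an illustrative pair of intervals and compares the squared interval in both frames. The function names `boost` and `interval_squared` and the sample values are assumptions chosen for illustration; they do not appear in the thesis.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def boost(dt, dl, v):
    """Standard Lorentz boost of a time interval dt (s) and a distance dl (m)
    into a frame moving with velocity v (m/s) along the same axis."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    dt_prime = gamma * (dt - v * dl / C**2)
    dl_prime = gamma * (dl - v * dt)
    return dt_prime, dl_prime


def interval_squared(dt, dl):
    """(ds)^2 = (dt)^2 - (dl)^2 / c^2, in units of seconds squared."""
    return dt**2 - dl**2 / C**2


# Illustrative (assumed) values: two events 2 s and 3e8 m apart,
# viewed from a frame moving at 0.6 c.
dt, dl = 2.0, 3.0e8
dt_prime, dl_prime = boost(dt, dl, 0.6 * C)

print("time intervals:", dt, dt_prime)        # these differ
print("distances:     ", dl, dl_prime)        # these differ
print("intervals:     ", interval_squared(dt, dl),
      interval_squared(dt_prime, dl_prime))   # these agree up to rounding
```

The time intervals and distances change from frame to frame, while the combination $(\Delta t)^2 - (\Delta l)^2/c^2$ agrees to within floating-point rounding.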
Possibly the lack of generally accepted invariants and foundational principles for quantum mechanics is the main reason for the problem in understanding quantum mechanics¹ and thus for the coexistence of philosophically quite different interpretations of the theory. In fact, we have a number of coexisting interpretations utilizing mutually contradictory concepts [Zeilinger, 1996]. A very incomplete list of the many interpretations of quantum mechanics includes the original Copenhagen Interpretation [Bohr, 1935], the Many-Worlds Interpretation [Everett, 1957], the Statistical Interpretation [Ballentine, 1970], Bohm's interpretation [Bohm, 1952], the Transactional Interpretation [Cramer, 1986], the Consistent Histories Interpretation [Griffiths, 1984] and Mermin's Ithaca interpretation [Mermin, 1998(a), 1998(b), 1998(c)].

¹ In his book [1967] Richard Feynman makes the following statement: "There was a time the newspaper said that only twelve men understood the theory of relativity. I do not believe there ever was such a time. There might have been a time when only one man did, because he was the only guy who caught on, before he wrote his paper. But after people read the paper a lot of people understood the theory of relativity in some way or other, certainly more than twelve. On the other hand, I think I can safely say that nobody understands quantum mechanics."

In any quantum experiment with discrete variables a number of different outcomes are possible, for example the individual spin outcomes "spin up" and "spin down". Before the experiment is performed an experimentalist knows only the specific probabilities for all possible individual outcomes. In chapter 1 we define a new measure of the experimentalist's information for an individual measurement, based on the fact that the only features defined before the measurement is performed are the specific probabilities for all possible individual outcomes. The observer is free to choose different experiments which might even completely exclude each other, for example measurements of orthogonal components of spin. This quantum complementarity of variables occurs when the corresponding operators do not commute. One quantity, for example the z-component of spin, might be well defined at the expense of maximal uncertainty about the other orthogonal components. In chapter 2 we define the total information content in a quantum system to be the sum over all individual measures for a complete set of mutually complementary experiments. The experimentalist may decide to measure a different set of complementary variables, thus gaining certainty about one or more variables at the expense of losing certainty about others. In the case of spin this could be the projections along rotated directions, for example, where the uncertainty in one component is reduced but the uncertainty in another component is increased correspondingly. Intuitively one expects that the total uncertainty or, equivalently, the total information carried by the system is invariant under such a transformation from one complete set of complementary variables to another. In chapter 2 we show that the total information defined according to our new measure has exactly that invariance property. We interpret the existence of that quantum information invariant as implying that in quantum mechanics information is the most fundamental notion.

In the first part of the thesis, "From Quantum Theory to an Information Invariant ..." (chapters 1 and 2), we argue, based on the known features of quantum physics, for the validity of the quantum information invariant and we suggest ideas for a foundational principle for quantum theory. In the second part of the thesis, "... and back" (chapter 3), we turn the reasoning around and, based on the suggested foundational principle for quantum mechanics, derive some essential features of the logical structure of quantum theory. Just as the foundational principles for special relativity imply invariance of the specific measure of distance (eigenzeit) in space-time with respect to all observers in inertial frames of reference, the suggested foundational principle for quantum mechanics will imply invariance of a specific operational information measure with respect to all possible observers' choices of a complete set of complementary experiments. By a foundational principle we do not mean an axiomatic formalization of the mathematical foundation of quantum mechanics, but a foundational conceptual principle which answers Wheeler's [1983] question "Why the Quantum?" This principle is then the reason for some essential features of quantum mechanics, like the irreducible randomness of an individual quantum event, quantum complementarity, the sinusoidal relation between probabilities and laboratory parameters, and entanglement. In this view we will discuss precisely the empirical significance of the terms involved in formulating quantum theory, particularly the notion of a quantum state, in a way which leads clearly to an understanding of the theory. However, we are aware of the possibility that this might not carry the same degree of emotional appeal for everyone. The conceptual groundwork for the ideas presented here has been prepared most notably by Bohr [1958], von Weizsäcker [1985] and Wheeler [1983].
Chapter 1
our ignorance, or information, as to which specific experimental result will be obtained in an individual run of the experiment plays a more fundamental role in quantum measurement than in classical measurement. Based on the fact that in an individual quantum measurement the only features defined before the measurement is performed are the probabilities for all possible individual outcomes to occur, we propose a new measure of information for an individual quantum measurement in Sec. 1.3. For clarity we emphasize that our measure of information is not equivalent to Shannon's information. In fact, we show in Sec. 1.2 that, because the root of a quantum measurement is completely different from that of a classical measurement, certain conceptual difficulties arise when we try to define information gain in a quantum measurement by the notion of Shannon's information. While Shannon's information is applicable when a measurement reveals a pre-existing property, our measure of information takes into account that, in general, a quantum measurement does not reveal a pre-existing property.
1.1 Unbestimmtheit vs. Unbekanntheit in a Quantum Experiment
We begin with a brief survey of the usual textbook examples. Perhaps the archetypical example is Einstein's recoiling-slit experiment [Bohr, 1949]. With this example Einstein hoped to give a gedanken double-slit experiment which would yield which-path information and also show the wave-like interference phenomenon. In a famous paper [1949], Bohr analyzed two arrangements related to the recoiling-slit experiment. In the first arrangement, the diaphragm placed in front of the diaphragm pierced with two slits can recoil (Fig. 1.1b) and reveal through which slit of the second diaphragm the photon reached the screen, inasmuch as only one of the momenta of a photon passing through one or the other slit is consistent with a known amount of recoil momentum. In the second arrangement (Fig. 1.1a), the diaphragm is fixed, so that the path cannot be determined. One finds that an interference pattern is exhibited only in the latter arrangement. Bohr concluded: "... we are presented with a choice of either tracing the path of a particle or observing interference effects." Another example along these lines is Feynman's [Feynman et al., 1965] version of Einstein's gedanken experiment. In this scheme the interfering electron is observed by light scattering. The scattering of a photon is used to detect the electron position just behind the slits, revealing through which slit the electron reached the screen. Feynman explained that this observation procedure destroys the interference pattern. He concluded his analysis with the following statement: "If an apparatus is capable of determining which hole the electron goes through, it cannot be so delicate that it does not disturb the pattern in an essential way. No one has ever found (or even thought of) a way around the uncertainty principle."

Figure 1.1: Two mutually exclusive experimental arrangements to observe the interference pattern (Fig. a) and the path of the particle (Fig. b) in the double-slit experiment. The figures are taken from [Bohr, 1949]. If the diaphragm with two slits is fixed, an interference pattern is exhibited, as given in Fig. a). In the experimental situation of Fig. b), when the diaphragm can recoil, no interference pattern is observed. Bohr [1949] writes: "Since, however, any reading of the scale, in whatever way performed, will involve an uncontrollable change in the momentum of the diaphragm, there will always be, in conformity with the indeterminacy principle, a reciprocal relationship between our knowledge of the position of the slit and the accuracy of the momentum control." The lack of our knowledge of the position of the slit then excludes the appearance of the interference phenomena.

In the experimental situations discussed so far, as in most other usual textbook examples, the which-path information is obtained by exposing the interfering particle to uncontrollable scattering effects. This has led to a number of misconceptions being put forward in the literature. According to the most significant misconception, loss of interference is due to an uncontrollable transfer of energy and/or momentum to the particle, associated with any attempt to observe the particle's path. Unavoidable disturbances are in turn attributed to the intrinsic clumsiness of any macroscopic measuring apparatus. Over the last few years experiments have been considered, and some already performed, in which the reason why no interference pattern arises is not any uncontrollable disturbance of the quantum system or the clumsiness of the apparatus. Rather, the lack of interference is due to the fact that the quantum state is prepared in such a way as to permit path information to be obtained, in principle, independent of whether the experimenter cares to read it out or not.
Figure 1.2: An arrangement for two-particle interferometry. The source emits two particles in the entangled state (1.1). Particle 1 traverses the Mach-Zehnder interferometer starting with the beams A and B, while particle 2 traverses the Mach-Zehnder interferometer starting with the beams C and D. Phase shifters in both interferometers permit continuous variations of the phases $\phi_1$ and $\phi_2$.
One line of such research considers the use of pairs of particles which are strongly entangled. Consider a setup where a source emits two particles with antiparallel momenta which then feed two Mach-Zehnder interferometers [Horne et al., 1989], [Rarity and Tapster, 1990], [Herzog et al., 1995], as shown in Fig. 1.2. Whenever particle 1 is found in beam A, particle 2 is found in beam C, and whenever particle 1 is found in beam B, particle 2 is found in beam D. The quantum state is

$$ |\psi\rangle = \frac{1}{\sqrt{2}} \left( |A\rangle_1 |C\rangle_2 + |B\rangle_1 |D\rangle_2 \right). \tag{1.1} $$
Will we now observe an interference pattern for particle 2, i.e. the well-known sinusoidal variation of the intensities registered in the detectors $U_2$ and $L_2$ upon variation of the phase $\phi_2$? The answer has to be negative, because by simply placing detectors in the beams A and B of particle 1 we can determine which path particle 2 took. The lack of interference can easily be calculated starting from the state (1.1). Yet, if we recombine the two paths of particle 1 as indicated in Fig. 1.2, and if we register both particle 1 in either detector $U_1$ or $L_1$ and particle 2 in either detector $U_2$ or $L_2$, we have forgone any possibility of obtaining path information. Therefore we conclude that an interference pattern should arise in coincidence counts between the detectors for particle 1 and for particle 2 shown in Fig. 1.2. This indeed follows from quantum mechanical calculations [Horne et al., 1989].

Another independent approach to complementarity in an interference experiment considers the use of micromasers in atomic beam experiments [Scully et al., 1991]. Typically in such an experiment, an atom passes through a cavity such that it exchanges exactly one photon with the cavity without changing momentum. Thus, by investigating the cavity, one has information on whether or not an atom passed through it, without influencing the momentum of the atom. Now, if we place one cavity into each of the two paths of the interference experiment, we may obtain information on which path the atom took. The interference pattern does not arise. It is the mere possibility of obtaining path information which guarantees that no interference occurs.¹ On the other hand, we can read the information in the micromasers in such a way as to erase all information on which micromaser the photon has been stored in. Then we have just the information that the atom passed through the apparatus, but not along which path. In this case the atoms counted in coincidence with the photons are members of an ensemble defining an interference pattern.

These two experiments underline clearly that complementarity does not originate in some uncontrollable disturbance of pre-assigned properties of a quantum system in a measurement process. In fact, as theorems like those of Bell [Bell, 1964] and Greenberger, Horne and Zeilinger [Greenberger et al. 1989, 1990] show, it is in principle not possible to assign to a quantum system properties that simultaneously correspond to complementary measurements and that, in order to agree with special relativity, have to be local. The impossibility in principle of local realism will now be briefly demonstrated for our example of the two-particle interference experiment given in Fig. 1.2. As the two particles in our example might be widely separated, it is natural to assume validity of the locality condition suggested by EPR [Einstein, Podolsky and Rosen, 1935]: "Since at the time of measurement the two systems no longer interact, no real change can take place in the second system in consequence of anything that may be done to the first system." Then, whether detector $U_2$ or $L_2$ is triggered for a specific phase $\phi_2$ must be independent of which measurement we actually perform on the other particle (e.g., independent of the phase $\phi_1$) and even independent of whether we care to perform any measurement at all on that particle. This assumption implies that certain combinations of expectation values have definite bounds.
¹ Scully et al. [1991] wrote: "... it is simply the information contained in a functioning measuring apparatus that changes the outcome of the experiment, and not uncontrollable alterations of the spatial wave function, resulting from the action of the measuring apparatus on the system under observation."

The mathematical expression of that bound is called Bell's inequality, of which many variants exist. For example, a version given by Clauser, Horne, Shimony and Holt [1969] is

$$ |E(\phi_1, \phi_2) - E(\phi_1, \phi_2')| + |E(\phi_1', \phi_2) + E(\phi_1', \phi_2')| \le 2, \tag{1.2} $$

where

$$ E(\phi_1, \phi_2) = P_{++}(\phi_1, \phi_2) + P_{--}(\phi_1, \phi_2) - P_{+-}(\phi_1, \phi_2) - P_{-+}(\phi_1, \phi_2). \tag{1.3} $$
For the quantum state (1.1) this becomes $E_{QM}(\phi_1, \phi_2) = \cos(\phi_2 - \phi_1)$, where we suppose a phase factor of $i$ for reflection and $1$ for transmission at the beam splitter. Here we assume that particle 1 gives result $+$ ($-$) when it triggers detector $U_1$ ($L_1$) and particle 2 gives result $+$ ($-$) when it triggers detector $U_2$ ($L_2$). Then, e.g., $P_{++}(\phi_1, \phi_2)$ is the joint probability that particle 1 gives $+$ and particle 2 gives $+$. Maximal violation occurs for $\phi_1' = 45^\circ$, $\phi_2' = 0^\circ$, $\phi_1 = 135^\circ$, $\phi_2 = 90^\circ$, where the left-hand side of Eq. (1.2) becomes $2\sqrt{2}$, in clear violation of the inequality. Thus, the assumption of local realism is in conflict with quantum physics itself. From this we learn that we cannot speak of complementarity as a consequence of some disturbance of a system in the measurement if there are no objective properties to disturb.² An important feature of the analysis so far is that we have to base our concept of complementarity on the much more fundamental concept of information. Any firm foundation of complementarity has to have recourse to the property of mutual exclusiveness of different classes of information of a quantum system. As stated by Pauli [1958] in the analysis of the uncertainty relations:³ "... diese Relationen enthalten die Aussage, daß jede genaue Kenntnis des Teilchenortes zugleich eine prinzipielle Unbestimmtheit, nicht nur Unbekanntheit des Impulses zur Folge hat und umgekehrt. Die Unterscheidung zwischen (prinzipieller) Unbestimmtheit und Unbekanntheit und der Zusammenhang beider Begriffe sind für die ganze Quantentheorie entscheidend."
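The size of the violation can be checked numerically. The sketch below takes the quantum correlation $E(\phi_1, \phi_2) = \cos(\phi_2 - \phi_1)$ as given, evaluates the Clauser-Horne-Shimony-Holt combination in its standard form $|E(a,b) - E(a,b')| + |E(a',b) + E(a',b')|$, and compares it with the largest value reachable when every setting has a predetermined result $\pm 1$. The function names and the assignment of the four angles (45, 0, 135 and 90 degrees) to primed and unprimed settings are assumptions of this sketch.

```python
import itertools
import math


def e_qm(phi1, phi2):
    """Quantum correlation assumed from the text: cos(phi2 - phi1)."""
    return math.cos(phi2 - phi1)


def chsh(e, a, a_prime, b, b_prime):
    """Standard CHSH combination for a correlation function e(a, b)."""
    return (abs(e(a, b) - e(a, b_prime))
            + abs(e(a_prime, b) + e(a_prime, b_prime)))


DEG = math.pi / 180.0

# Assumed labelling: settings of particle 1 are a, a'; of particle 2 are b, b'.
a, a_prime = 135 * DEG, 45 * DEG
b, b_prime = 90 * DEG, 0 * DEG
quantum_value = chsh(e_qm, a, a_prime, b, b_prime)

# Local deterministic models: each of the four settings carries a fixed
# result +1 or -1, independent of the setting chosen on the other side.
local_max = max(
    abs(ra * rb - ra * rbp) + abs(rap * rb + rap * rbp)
    for ra, rap, rb, rbp in itertools.product((+1, -1), repeat=4)
)

print("quantum CHSH value:", quantum_value)  # close to 2*sqrt(2)
print("local bound:       ", local_max)      # 2
```

Every deterministic local assignment gives exactly 2, so probabilistic mixtures of them cannot exceed 2 either, whereas the cosine correlation reaches $2\sqrt{2}$ at these settings.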
We note that a view of information as the most fundamental concept in quantum mechanics also leads to the most natural understanding of new phenomena in quantum computation [Barenco et al., 1995(a)], entanglement swapping [Żukowski et al., 1993], [Pan et al., 1998], quantum cloning [Wootters and Zurek, 1982], [Bužek and Hillery, 1996] and quantum communication such as quantum dense coding [Mattle et al., 1996], quantum cryptography [Bennett et al., 1992] and quantum teleportation [Bennett et al., 1993], [Bouwmeester et al., 1997].
² Bohr dislikes phrases like "disturbing phenomena by observations" exactly because of their potential for confusion. He stresses [Bohr, 1958] the use of the word "phenomenon" exclusively to refer to observations made under specific circumstances, including an account of the whole experimental arrangement.

³ Translated: "... these relations contain the statement that any precise knowledge of the position of a particle implies a fundamental indefiniteness, not just an unknownness, of the momentum as a consequence, and vice versa. The distinction between (fundamental) indefiniteness and unknownness, and the connection between these two notions, is decisive for the whole of quantum theory."
1.2 Conceptual Inadequacy of the Shannon Information in a Quantum Measurement
Shannon's measure of information is generally considered to be very useful for describing information in a physical observation. Here we will see that, while this is rather natural in classical physics, it becomes problematic and even untenable in quantum physics. There are various ways to motivate the Shannon measure of information. In an operational approach, Shannon's information is introduced as the expected minimal number of binary questions, i.e. questions with "yes" or "no" answers only, required to discern the outcome of an experiment. In an axiomatic approach, the Shannon measure is uniquely specified by Shannon's postulates, which establish some intuitively clear relations between the individual amounts of information gained in different individual observations. And in a physical approach, Shannon's information is characterized in terms of some natural properties which are essential from the point of view of the physics considered. When investigating these three approaches in the next sections we will notice that each approach contains an element that escapes complete and full description in quantum mechanics. This element is always associated with the objective randomness of individual quantum events and with quantum complementarity.
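As a minimal sketch of the operational reading, the snippet below builds an optimal yes/no questioning strategy with Huffman's algorithm (a standard construction, named here as an assumption; the thesis does not prescribe it) and compares the expected number of questions with the Shannon information. The function names and both probability distributions are illustrative choices.

```python
import heapq
import math


def shannon_entropy(probs):
    """H = -sum p_i log2 p_i, in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def expected_questions(probs):
    """Expected number of yes/no questions under a Huffman question tree.

    Repeatedly merging the two least likely groups builds an optimal binary
    tree; each merge adds one question for every outcome inside the merged
    group, so the expected depth is the sum of the merged weights.
    """
    heap = list(probs)
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        total += merged
        heapq.heappush(heap, merged)
    return total


dyadic = [0.5, 0.25, 0.125, 0.125]   # all probabilities are powers of 1/2
skewed = [0.4, 0.3, 0.2, 0.1]        # no exact halving is possible

for probs in (dyadic, skewed):
    print(probs, expected_questions(probs), shannon_entropy(probs))
```

For probabilities that are powers of 1/2 the two numbers coincide; otherwise the expected number of questions for a single outcome lies between $H$ and $H + 1$, which is why asking about long sequences of outcomes is needed to approach $H$ per outcome.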
1.2.1 An Operational Approach
For classical observations the case for Shannon's information can be strengthened through an operational approach to the question. To carry this out, consider the following example. An urn is filled with colored balls. The proportions in which the different colors are present are known. Now the urn is shaken, and we draw a single ball. To what extent can we predict the color of the drawn ball? If all the balls in the urn are of the same color, we can completely predict the outcome of the draw. On the other hand, if the various colors are present in equal proportions, we are completely uncertain about the outcome. One can think of these situations as extreme cases on a varying scale of predictability.

As a specific example consider an urn containing balls of four colors: black, white, red, and green, with the proportions $p_1 = 1/2$, $p_2 = 1/4$, $p_3 = 1/8$ and $p_4 = 1/8$, respectively. Suppose now that one wishes to learn the color of the drawn ball by asking questions to which only "yes" or "no" can be given as an answer. Of course, the number of questions needed will depend on the questioning strategy adopted. In order to make this strategy optimal, that is, in order that we can expect to gain maximal information from each yes-or-no question, we evidently have to ask questions whose answers will strike out half of the possibilities. Indeed, a good question to start with is "Is the color of the drawn ball black?" (Fig. 1.3), the virtue being that, regardless of the answer "yes" or "no", we will be able to strike out a weighted half of the possibilities. If the answer is "yes", then we are done. If the answer is "no", one may divide the set that remains after this first round into two parts of equal probability, {white} and {red, green}, and proceed by posing the question "Is the color of the drawn ball white?". Again, if the answer is "yes", we are done, and if the answer is "no" we proceed in a similar fashion until the identity of the outcome is at hand. A particular outcome is specified by writing down, in order, the yeses and nos encountered in travelling from the root to the specific leaf of the tree schematically depicted in Fig. 1.3.

Figure 1.3: Binary question tree to determine the color of a drawn ball. The proportions in which black, white, red and green colors are present are $p_1 = 1/2$, $p_2 = 1/4$, $p_3 = 1/8$ and $p_4 = 1/8$, respectively.

It is easy to see that, following the above optimal strategy, the mean minimal number of binary questions needed to determine the color of the drawn ball is

$$ p_1 \cdot 1 + p_2 \cdot 2 + (p_3 + p_4) \cdot 3 = \frac{1}{2} \cdot 1 + \frac{1}{4} \cdot 2 + \left( \frac{1}{8} + \frac{1}{8} \right) \cdot 3 = \frac{7}{4}. $$

Notice that this may be written as

$$ -\frac{1}{2} \log \frac{1}{2} - \frac{1}{4} \log \frac{1}{4} - \frac{1}{8} \log \frac{1}{8} - \frac{1}{8} \log \frac{1}{8} = -\sum_{i=1}^{4} p_i \log p_i, $$

where the logarithm is taken to base 2.

Now of course, for an arbitrary probability distribution $p_1$, $p_2$, $p_3$ and $p_4$ over a set of colors, a division into two sets of equal probability is not always
possible. One may then consider a generalized situation where we draw a ball $N$ times without replacing the drawn ball. We assume again that we wish to learn the colors of the $N$ drawn balls by asking questions to which only "yes" or "no" can be given as an answer. Now, however, questions of a mixed type may be asked, like "Is the color of the first drawn ball black or white, of the second drawn ball red, ..., and of the $N$-th black or white or green?". In this manner it becomes easier to find questions for which the probabilities of "yes" and "no" are approximately equal, and thus the total number of questions needed can be reduced. Suppose $p_1 N$, $p_2 N$, $p_3 N$ and $p_4 N$ are all integers; then the probability of obtaining a sequence containing $p_1 N$ black balls, $p_2 N$ white balls, $p_3 N$ red balls and $p_4 N$ green balls is [Shannon, 1949]

$$ p(\text{sequence}) = p_1^{p_1 N} \, p_2^{p_2 N} \, p_3^{p_3 N} \, p_4^{p_4 N} = \frac{1}{2^{N H}}, $$

where

$$ H = -\sum_{i=1}^{4} p_i \log p_i \tag{1.4} $$

is the Shannon information, expressed in bits when the logarithm is taken to base 2. Such a sequence is called a typical sequence.⁴ Notice that a particular typical sequence is specified by the particular order of the balls, distinguishable by the particular color sequence. The total number of typical sequences can be obtained as the number of distinguishable permutations of $N$ balls made up of 4 groups of black, white, red and green balls that are indistinguishable within each group. If $N$ is sufficiently large, then

$$ \frac{N!}{(p_1 N)! \, (p_2 N)! \, (p_3 N)! \, (p_4 N)!} \approx 2^{N H}, \tag{1.5} $$

where we use the Stirling approximation $N! \approx \sqrt{2 \pi N} \, N^N e^{-N}$. Hence, the typical sequences all have equal probability, and there are $2^{NH}$ of them.

Let us now turn back to our problem. We wish to learn the colors of the $N$ drawn balls by asking questions to which only "yes" or "no" can be given as the answer.
4 To be specific, we define the set of typical sequences to be all sequences such that $2^{-N(H+\epsilon)} \leq p(\text{sequence}) \leq 2^{-N(H-\epsilon)}$ with $\epsilon > 0$. Now, it can be shown that the probability that $N$ outcomes actually form a typical sequence is greater than $1 - \epsilon$, for sufficiently large $N$, no matter how small $\epsilon$ might be.
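The counting argument behind Eqs. (1.4) and (1.5) can be checked numerically. The following Python sketch computes the Shannon information of the four-color urn of Fig. 1.3, the mean number of questions of the optimal strategy, and the exact number of orderings with fixed color counts, which it compares with the estimate $2^{NH}$. The function names and the value $N = 800$ are illustrative assumptions and do not appear in the text.

```python
import math

def shannon_information(probs):
    """Shannon information in bits, H = -sum_i p_i log2 p_i."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def log2_sequence_count(counts):
    """log2 of N! / (n_1! n_2! ...), the number of distinguishable orderings."""
    n_total = sum(counts)
    ln_count = math.lgamma(n_total + 1) - sum(math.lgamma(n + 1) for n in counts)
    return ln_count / math.log(2)

probs = [1/2, 1/4, 1/8, 1/8]   # black, white, red, green
questions = [1, 2, 3, 3]       # depth of each leaf in the tree of Fig. 1.3

H = shannon_information(probs)
mean_questions = sum(p * k for p, k in zip(probs, questions))

N = 800                        # illustrative choice; all p_i * N are integers
counts = [round(p * N) for p in probs]
exact_exponent = log2_sequence_count(counts)   # log2 of the exact count
estimated_exponent = N * H                     # exponent in the estimate 2^(N H)

print(H, mean_questions)       # both equal 7/4 = 1.75
print(exact_exponent, estimated_exponent)
```

The exact exponent stays slightly below $NH$, because the Stirling approximation drops factors that grow only polynomially in $N$; the ratio of the two exponents approaches one as $N$ increases.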
Figure 1.4: Binary question tree to determine the specific sequence of outcomes (colors of the drawn balls) in a sufficiently large number $N$ of experimental trials (number of drawings). An urn is filled with black and white balls with proportions $p_1$ and $p_2$, respectively. The expected number of questions needed to determine the actual sequence of outcomes is $NH$, where $H = -p_1 \log p_1 - p_2 \log p_2$.

If we address this problem in a piecewise manner, determining the colors of the drawn balls one after another, the number of questions needed will just be $N$ times that needed for a single ball. However, we may use another strategy. Suppose $N$ is sufficiently large that the sequence of $N$ drawn balls contains close to $p_1 N$ black balls, $p_2 N$ white balls, $p_3 N$ red balls and $p_4 N$ green balls. In other words, suppose the $N$ drawn balls form a typical sequence. Now, in order to learn the colors of the drawn balls we need only identify which particular typical sequence is actually drawn. Since there are $2^{NH}$ possible typical sequences and all of them have equal probability to be drawn, the minimal number of yes-no questions needed is just $NH$. Or equivalently, the Shannon information$^5$ expressed in bits is the minimal number of yes-no questions necessary to determine which particular sequence of outcomes occurs, divided by $N$ [Feinstein, 1958], [Uffink, 1990]. This is known as the noiseless coding theorem. An explicit example with an urn containing balls of two different colors is given in Fig. 1.4. A generalization
5 The Shannon information therefore refers to the information about an individual outcome of an experiment. This should be contrasted to the cases where the notion of information refers to knowledge about an unknown parameter in a probability distribution [Fisher, 1925], or the information for discriminating between two probability distributions [Kullback, 1959], or the information that one event provides about another event [Gelfand and Yaglom, 1957].
for the probability distribution $p_1, p_2, \ldots, p_n$ over a finite set of $n$ colors may easily be obtained.

We now analyze Shannon's notion of information in a quantum measurement. In particular we consider a beam of photons prepared with vertical polarization and analyzed by a filter polarized at an angle of 45° from the vertical position. Each individual photon, when it encounters the polarization filter, has exactly two equally probable options: to pass straight through the filter (we call this the outcome 1) or to be absorbed by the filter (the outcome 0). Now suppose we perform the polarization experiment a sufficiently large number $N$ of times so that the sequence of actual outcomes forms a typical sequence. We observe a particular sequence of 1's and 0's. An individual outcome observed in a single experimental trial is fundamentally random and cannot be assumed to reveal a property of the individual photon, assigned before the measurement is performed, to pass through the filter or to be absorbed by it. The principal indefiniteness, in the sense of the fundamental nonexistence of a detailed description of and prediction for the individual quantum event resulting in the particular measurement result, implies that the particular outcome sequence of 1's and 0's, specified by writing down, in order, the "yes"s and "no"s encountered in a row of yes/no questions asked, is not defined before the measurement is performed. This implies that Shannon's information, defined as the number of yes/no questions needed to determine the particular order of 1's and 0's in the actual sequence of outcomes, cannot be assumed to describe our ignorance about the future measurement results that is given before the measurements are performed and that is then removed after the measurements are performed. The reason is that no individual outcome, and consequently no particular order of 1's and 0's we observe in the sequence of measurements, is defined before the measurements are performed.
Of course, after the measurement is performed and its actual result becomes known, the information necessary to specify the measurement result is quantified by the Shannon measure of information. Yet, this information has no reference to the particular experimental situation given before the experiment is performed and therefore it is not appropriate to define the information about the system that is gained by the performance of the experiment. In the sense that an individual quantum event manifests itself only in the measurement process and is not precisely defined before the measurement is performed, we may speak of a "creation" of Shannon's information in the measurement. In our explicit example, the amount of created information is maximal because vertical polarization and polarization at 45° are maximally complementary attributes. It is interesting to contrast this with Shannon's [1949] writing of information as
being produced by a source. The Shannon information is surely adequate for the situation in classical physics where we can always mentally split the ensemble into its constituents and where the stochastic behavior of the whole ensemble follows from the behavior of its intrinsically different individual constituents, which can be thought of as being defined to any precision. In classical physics, this can be done even in situations where we have no way to distinguish the individual constituents and their behavior experimentally. If we perform a sequence of measurements on the ensemble, the particular order of individual events that is recorded is predetermined and originates in the intrinsic properties the individual constituents possess before the measurements. The Shannon information may then be assumed to measure the information necessary to reveal the property of an individual system of the ensemble given before the measurements are performed. Again this cannot be assumed in a quantum measurement, because a quantum measurement, with the only exception being that of a system in an eigenstate of the measured observable, changes the state of the system into a new state in a fundamentally unpredictable way, and thus cannot be claimed to reveal a property existing before the measurement is performed. In fact, as theorems like that of Kochen and Specker [Kochen and Specker, 1967] show, in quantum mechanics it is not possible, not even in principle, to assign to a quantum system properties corresponding to all possible measurements.
1.2.2 An Axiomatic Approach
An important reason given in the literature for preferring the Shannon measure of information is that it is uniquely characterized by Shannon's intuitively reasonable postulates, so that alternative expressions should be rejected. This has been expressed strongly by Jaynes [1957] in the words: "One ... important reason for preferring the Shannon measure is that it is the only one that satisfies ... [Shannon's postulates]. Therefore one expects that any deduction made from other information measures, if carried far enough, will eventually lead to contradiction." A good way to continue our discussion is by reviewing how Shannon, using his postulates, arrived at his famous expression. He writes [1949]: "Suppose we have a set of possible events whose probabilities of occurrence are $p_1, p_2, \ldots, p_n$. These probabilities are known but that is all we know concerning which event will occur. Can we find a measure of how much 'choice' is involved in the selection of the event or how uncertain we are of the outcome? If there is such a measure, say $H(p_1, p_2, \ldots, p_n)$, it is reasonable to require of
it the following properties: 1. $H$ should be continuous in the $p_i$. 2. If all the $p_i$ are equal, $p_i = \frac{1}{n}$, then $H$ should be a monotonic increasing function of $n$. With equally likely events there is more choice, or uncertainty, when there are more equally likely events.
Figure 1.5: Decomposition of a choice from three possibilities. Figure taken from [Shannon, 1949].
3. If a choice be broken down into two successive choices, the original $H$ should be the weighted sum of the individual values of $H$. The meaning of this is illustrated in Fig. 1.5. At the left we have three possibilities $p_1 = \frac{1}{2}$, $p_2 = \frac{1}{3}$, $p_3 = \frac{1}{6}$. On the right we first choose between two possibilities each with probability $\frac{1}{2}$, and if the second occurs make another choice with probabilities $\frac{2}{3}$, $\frac{1}{3}$. The final results have the same probabilities as before. We require, in this special case, that
\[ H\left(\frac{1}{2}, \frac{1}{3}, \frac{1}{6}\right) = H\left(\frac{1}{2}, \frac{1}{2}\right) + \frac{1}{2} H\left(\frac{2}{3}, \frac{1}{3}\right). \]
The coefficient $\frac{1}{2}$ is the weighing factor introduced because this second choice occurs half the time."

Shannon then shows that only the function (1.4) satisfies all three postulates. It is clear from the way Shannon formulated the problem that $H$ is introduced as an uncertainty about the outcome of an experiment based on a given probability distribution. The uncertainty arises, of course, because the probability distribution does not enable us to predict exactly what the actual outcome will be. This uncertainty is removed when the experiment is performed and its actual outcome becomes known. Thus, we may think of $H$ as the amount of information that is gained by the performance of the experiment. We now turn to the discussion of Shannon's postulates. While the first two postulates are purely qualitative and natural for every meaningful measure of information, the last postulate might appear to have no immediate intuitive
appeal. The third Shannon postulate, originally formulated as an example, was reformulated as an exact rule by Faddeev [1957]: For every $n \geq 2$
\[ H(p_1, \ldots, p_{n-1}, q_1, q_2) = H(p_1, \ldots, p_{n-1}, p_n) + p_n H\left(\frac{q_1}{p_n}, \frac{q_2}{p_n}\right), \tag{1.6} \]
where $p_n = q_1 + q_2$. Without physical interpretation the recursion postulate (1.6) is merely a mathematical expression which is certainly necessary for the uniqueness of the function (1.4) but has no further physical significance. We adopt the following well-known interpretation [Uffink, 1990], [Jaynes, 1996]. Assume the possible outcomes of the experiment to be $a_1, \ldots, a_n$ and $H(p_1, \ldots, p_n)$ to represent the amount of information that is gained by the performance of the experiment. Now, decompose event $a_n$ into two distinct events $a_n \wedge b_1$ and $a_n \wedge b_2$ ("$\wedge$" denotes "and", thus $a \wedge b$ denotes a joint event). Denote the probabilities of outcomes $a_n \wedge b_1$ and $a_n \wedge b_2$ by $q_1$ and $q_2$, respectively. Then the left-hand side $H(p_1, \ldots, p_{n-1}, q_1, q_2)$ of Eq. (1.6) represents the amount of information that is gained by the performance of the experiment with outcomes $a_1, \ldots, a_{n-1}, a_n \wedge b_1, a_n \wedge b_2$. When the outcome $a_n$ occurs, the conditional probabilities for $b_1$ and $b_2$ are $\frac{q_1}{p_n}$ and $\frac{q_2}{p_n}$ respectively, and the amount of information gained by the performance of the conditional experiment is $H\left(\frac{q_1}{p_n}, \frac{q_2}{p_n}\right)$. Hence the recursion requirement states that the information gained in the experiment with outcomes $a_1, \ldots, a_{n-1}, a_n \wedge b_1, a_n \wedge b_2$ equals the sum of the information gained in the experiment with outcomes $a_1, \ldots, a_n$ and the information gained in the conditional experiment with outcomes $b_1$ or $b_2$, given that the outcome $a_n$ occurred with probability $p_n$. This interpretation implies that the third postulate can be rewritten as
\[ H(p(a_1), \ldots, p(a_{n-1}), p(a_n \wedge b_1), p(a_n \wedge b_2)) = H(p(a_1), \ldots, p(a_{n-1}), p(a_n)) + p(a_n) H(p(b_1|a_n), p(b_2|a_n)), \tag{1.7} \]
where $p(a_n) = p(a_n \wedge b_1) + p(a_n \wedge b_2)$, $p(a_n \wedge b_1) = p(a_n) p(b_1|a_n)$ and $p(a_n \wedge b_2) = p(a_n) p(b_2|a_n)$.
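As a quick numerical check that the Shannon measure (1.4) obeys the recursion postulate (1.6), the following Python sketch evaluates both sides for Shannon's own example, $H(\frac{1}{2}, \frac{1}{3}, \frac{1}{6}) = H(\frac{1}{2}, \frac{1}{2}) + \frac{1}{2} H(\frac{2}{3}, \frac{1}{3})$, and for a second, arbitrarily chosen distribution. The helper names and the second distribution are illustrative assumptions.

```python
import math

def shannon_information(probs):
    """Shannon information in bits, H = -sum_i p_i log2 p_i."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def recursion_sides(probs, q1, q2):
    """Return (lhs, rhs) of Eq. (1.6) for probs = (p_1, ..., p_n), p_n = q1 + q2."""
    p_n = probs[-1]
    if not math.isclose(p_n, q1 + q2):
        raise ValueError("q1 + q2 must equal the last probability p_n")
    lhs = shannon_information(list(probs[:-1]) + [q1, q2])
    rhs = shannon_information(probs) + p_n * shannon_information([q1 / p_n, q2 / p_n])
    return lhs, rhs

# Shannon's example of Fig. 1.5: the second outcome (probability 1/2)
# is decomposed into outcomes with probabilities 1/3 and 1/6.
lhs_shannon, rhs_shannon = recursion_sides([1/2, 1/2], 1/3, 1/6)

# An arbitrary second example (assumed values, chosen only for illustration).
lhs_other, rhs_other = recursion_sides([0.2, 0.3, 0.5], 0.1, 0.4)

print(lhs_shannon, rhs_shannon)
print(lhs_other, rhs_other)
```

In both cases the two sides agree up to floating-point rounding, as expected for a function of the form (1.4).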
In Eq. (1.7), $p(b_i|a_n)$, $i = 1, 2$, denotes the conditional probability for outcome $b_i$ given that the outcome $a_n$ occurred, and $p(a_n \wedge b_i)$ denotes the joint probability that outcome $a_n \wedge b_i$ occurs. If we analyze the generalized situation with $n$ outcomes $a_i$ of the first experiment $A$, $m$ outcomes $b_j$ of the conditional experiment $B$ and $mn$ outcomes $a_i \wedge b_j$ of the joint experiment $A \wedge B$, we may then rewrite the recursion postulate in a short form as
\[ H(A \wedge B) = H(A) + H(B|A), \tag{1.8} \]
where $H(B|A) = \sum_{j=1}^{n} p(a_j) H(b_1|a_j, \ldots, b_m|a_j)$ is the average of the information gained by observation $B$ given that the conditional outcome $a_j$ occurred, weighted by the probability $p(a_j)$ for $a_j$ to occur.

It is essential to note that the recursion postulate is inevitably related to the manner in which we gain information in a classical measurement. In fact, in classical measurements it is always possible to assign to a system simultaneously attributes corresponding to all possible measurements, here $a_i$, $b_j$ and $a_i \wedge b_j$. Also, the interaction between the measuring apparatus and a classical system can be thought to be made arbitrarily small, so that the experimental determination of $A$ has no influence on our possibility to predict the outcomes of the possible future experiment $B$. In conclusion, the information expected from the joint experiment $A \wedge B$ is simply the sum of the information expected from the first experiment $A$ and the conditional information of the second experiment $B$ with respect to the first, as predicted by Eq. (1.8). In contrast, we know that in a quantum measurement it is not possible to assign to a system simultaneously complementary attributes, like position and momentum, or the path of the system and the position of appearance in the interference pattern in the double-slit experiment, or the spin values along orthogonal directions. Therefore Shannon's crucial third postulate (1.8), necessary for the uniqueness of Shannon's measure of information, is not well-defined in quantum mechanics when $A$ and $B$ are measurements of mutually complementary attributes. Consequently, the Shannon measure loses its preferential status with respect to alternative expressions when applied to define information gain in quantum measurements. Here a certain misconception might be put forward that arises from a certain operational point of view.
According to that view, for example, complementarity between the interference pattern and information about the path of the system in the double-slit experiment arises from the fact that any attempt to observe the particle path would be associated with an uncontrollable disturbance of the particle. Such a disturbance in itself would then be the reason for the loss of the interference pattern. In such a view it would be possible to define Shannon's information for all attributes of the system simultaneously, and the third Shannon postulate would be violated because of the unavoidable disturbance of the system occurring whenever the subsequently measured property $B$ is incompatible with the previous one $A$. Yet, this is a misconception not only because it
was shown [Bell, 1964], [Greenberger et al. 1989, 1990] that it is in principle impossible to assign to a quantum system simultaneously observation-independent properties (which in order to be in agreement with special relativity have to be local), but also because some experiments have already been performed [Herzog et al., 1995] where the reason why no interference pattern arises is not an uncontrollable disturbance of the quantum system (see also Sec. 1.1). We next introduce two requirements that are immediate consequences of Shannon's postulate and in which all the probabilities that appear are well-defined in quantum mechanics. We will show that the two requirements are violated by the information gained in quantum measurements.

1. Every new observation reduces our ignorance and increases our knowledge. In his work Shannon [1949] offers a list of properties to substantiate that $H$ is a reasonable measure of information. He writes: "It is easily shown that $H(A \wedge B) \leq H(A) + H(B)$ with equality only if the events are independent (i.e., $p(a_i \wedge b_j) = p(a_i)p(b_j)$). The uncertainty of a joint event is less than or equal to the sum of the individual uncertainties." He continues further in the text: "... we have $H(A) + H(B) \geq H(A \wedge B) = H(A) + H(B|A)$. Hence,
\[ H(B) \geq H(B|A). \tag{1.9} \]
The uncertainty of $B$ is never increased by knowledge of $A$. It will be decreased unless $A$ and $B$ are independent events, in which case it is not changed" (we have changed Shannon's notation to coincide with that of our work).

2. Information is indifferent to the order of its acquisition. The total amount of information gained in successive measurements is independent of the order in which it is acquired, so that the amount of information gained by the observation of $A$ followed by the observation of $B$ is equal to the amount of information gained from the observation of $B$ followed by the observation of $A$:
\[ H(A) + H(B|A) = H(B) + H(A|B). \tag{1.10} \]
This is an immediate consequence of the recursion postulate, which can be obtained when we write the recursion postulate in two different ways depending on whether the observation of $A$ is followed by the observation of $B$ or vice versa. An explicit example for a sequence of classical measurements is given in Fig. 1.6.

Figure 1.6: Indifference of information to the order of its acquisition in classical measurements. A box is filled with balls of different compositions (plastic and wooden balls) and different colors (black and white balls). Now, the box is shaken. In Fig. a) we first draw a ball asking about the color of the drawn ball and gain $H(\text{color}) = 1$ bit of information. Subsequently, we put the black and white balls in separate boxes, draw a ball from each box separately and ask about the composition of the drawn ball. We gain $H_{bl}(\text{comp.}) = 0$ bits for the black balls and $H_{wh}(\text{comp.}) = 1$ bit for the white balls. In Fig. b) we pose the two questions in the opposite order. We first ask about the composition of the drawn ball and gain $H(\text{comp.}) = 0.81$ bits. In a conditional drawing we ask about the color of the drawn ball and gain $H_{wo}(\text{color}) = 0$ bits for the wooden balls and $H_{pl}(\text{color}) = 0.92$ bits for the plastic balls. The total information gained is independent of the order of the two questions asked, i.e. $H(\text{color}) + \frac{1}{2} H_{bl}(\text{comp.}) + \frac{1}{2} H_{wh}(\text{comp.}) = H(\text{comp.}) + \frac{1}{4} H_{wo}(\text{color}) + \frac{3}{4} H_{pl}(\text{color}) = 1.5$.
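The numbers quoted for Fig. 1.6 can be reproduced with a few lines of Python. In the sketch below the joint proportions (one half black plastic, one quarter white plastic, one quarter white wooden) are inferred from the conditional values in the caption and are therefore an assumption of this example, as are all function names. The code evaluates both sides of requirement (1.10) and also checks requirement (1.9) for this classical case.

```python
import math
from collections import defaultdict

def shannon_information(probs):
    """Shannon information in bits, H = -sum_i p_i log2 p_i."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def marginal(joint, axis):
    """Marginal distribution over one attribute (axis 0: color, axis 1: composition)."""
    result = defaultdict(float)
    for outcome, p in joint.items():
        result[outcome[axis]] += p
    return dict(result)

def conditional_information(joint, given_axis):
    """H(other | given) = sum_a p(a) H(other | a)."""
    total = 0.0
    for value, p_value in marginal(joint, given_axis).items():
        conditional = [p / p_value for outcome, p in joint.items()
                       if outcome[given_axis] == value]
        total += p_value * shannon_information(conditional)
    return total

# Joint proportions inferred from the caption of Fig. 1.6 (assumption).
joint = {
    ("black", "plastic"): 1/2,
    ("white", "plastic"): 1/4,
    ("white", "wooden"): 1/4,
}

H_color = shannon_information(marginal(joint, 0).values())
H_comp = shannon_information(marginal(joint, 1).values())
H_comp_given_color = conditional_information(joint, given_axis=0)
H_color_given_comp = conditional_information(joint, given_axis=1)

color_first = H_color + H_comp_given_color   # order of Fig. 1.6 a)
comp_first = H_comp + H_color_given_comp     # order of Fig. 1.6 b)
print(color_first, comp_first)               # both are 1.5 bits up to rounding
```

Both orders give the same total, and in each order the conditional information does not exceed the corresponding unconditional one, in agreement with (1.9) and (1.10).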
Are these two requirements satisfied by information gained in quantum measurements? Consider a beam of randomly polarized photons. Filters $F_{\updownarrow}$, $F_{\nearrow}$ and $F_{\leftrightarrow}$ are oriented vertically, at +45°, and horizontally, respectively, and can be placed so as to intersect the beam of photons (Fig. 1.7). If we insert filter $F_{\updownarrow}$ the intensity at the detection plate will be half of the intensity of the incoming beam. The outgoing photons now all have vertical polarization. Notice that the function of filter $F_{\updownarrow}$ cannot be explained as a "sieve" that only lets those photons pass that already have vertical polarization in the incoming beam. If that were the case, only a certain small number of the randomly polarized incoming photons would have vertical polarization, so we would expect a much larger attenuation of the intensity of the beam as it passes the filter. Insertion of filters $F_{\nearrow}$ and $F_{\leftrightarrow}$ corresponds to the measurements of $A$, polarization at +45°, and $B$, horizontal polarization, respectively. Now, when filter $F_{\leftrightarrow}$ is inserted behind the filter $F_{\updownarrow}$, the intensity of the outgoing beam drops to zero. None of the photons with vertical polarization can pass through the
Figure 1.7: New observation (of polarization at 45°) reduces our knowledge (of the vertical/horizontal polarization) at hand from a previous observation. Filters $F_{\updownarrow}$, $F_{\nearrow}$ and $F_{\leftrightarrow}$ are oriented vertically, at +45° and horizontally, respectively. If filter $F_{\leftrightarrow}$ is inserted behind the filter $F_{\updownarrow}$ (Fig. a), no photons are observed at the detector plate. In that case we have complete knowledge of the vertical/horizontal polarization of the photon. After filter $F_{\nearrow}$ is inserted between $F_{\updownarrow}$ and $F_{\leftrightarrow}$ (Fig. b), a certain number of photons will be observed at the detection plate. Here acquisition of information about the polarization of the photon at 45° leads to a decrease of our knowledge about the vertical/horizontal polarization of the photon.
horizontal filter as shown in Fig. 1.7a. In this case we have complete knowledge of the property $B$, i.e. $H(B) = 0$. Notice that a sieve model where $F_{\leftrightarrow}$ ($F_{\updownarrow}$) only lets those photons pass that already have horizontal (vertical) polarization in the incoming beam could explain this behaviour. Now, after filter $F_{\nearrow}$ is inserted between $F_{\updownarrow}$ and $F_{\leftrightarrow}$, a certain intensity will be visible at the detection plate, exactly $\frac{1}{4}$ of the intensity of the beam passed through $F_{\updownarrow}$, as shown in Fig. 1.7b. In that case, a certain number of photons that passed through $F_{\nearrow}$ will also pass through $F_{\leftrightarrow}$. Therefore, acquisition of information about the polarization of the photon at 45° leads to a decrease of our knowledge about the horizontal polarization of the photon, implying $H(B|A) > 0$. Consequently, $0 = H(B) \ngeq H(B|A) > 0$, which clearly violates requirement (1.9). Now, imagine that after $F_{\leftrightarrow}$ we insert the filter $F_{\nearrow}$ in Fig. 1.7a (this, of course, does not make any essential change compared with the situation without the additional filter). We may consider this new experimental situation as a sequence of measurements $BA$. Now, the information gained in the sequence $BA$ in Fig. 1.7a differs from the information gained in the sequence $AB$ in Fig. 1.7b, i.e. $0 = H(B) + H(A|B) \neq H(A) + H(B|A) > 0$, thus violating the requirement (1.10). Another independent example where requirement (1.10) is violated is given in Fig. 1.8. Here we have an effect which cannot be explained by a sieve model. Classical experience suggests that the addition of a filter should only be able to decrease the intensity of the beam getting through. In a sieve model where the filter does not change the object, adding a new filter will always reduce the intensity of the beam. For completeness we note that a classical wave model can explain
Figure 1.8: Dependence of information on the order of its acquisition in successive quantum measurements. A spin-1/2 particle is in the state $|z+\rangle$, spin-up along the $z$-axis. Spin along the $x$-axis and spin along the direction in the $x$-$z$ plane tilted at an angle $\theta$ from the $z$-axis are successively measured, in the order in Fig. a) and opposite to that in Fig. b). Whereas we obtain an equal portion $H(\cos^2(\pi/4 - \theta/2), \sin^2(\pi/4 - \theta/2))$ of information in the conditional (subsequent) measurement both in Fig. a) and in Fig. b), the amounts of information $H(\cos^2\theta/2, \sin^2\theta/2)$ and $H(\frac{1}{2}, \frac{1}{2}) = 1$ we gain in the first measurement in Fig. a) and in the first measurement in Fig. b), respectively, can be significantly different. Specifically for $\theta \to 0$ we have complete knowledge about the spin along the direction at the angle $\theta$ in Fig. a) but absolutely no knowledge about the spin along the $x$-axis in Fig. b). We emphasize that we do not assume any specific functional dependence for the measure of information $H$.
the increase of the intensity of the wave transmitted through the filters. In contrast to the sieve model, where adding a new filter just adds some new knowledge of the object and never decreases our knowledge at hand from the previous measurements, a quantum measurement can decrease our knowledge collected from previous measurements. This originates from the distinction between "maximal" and "complete" information in quantum physics. In classical physics the maximal information about a system is complete. In quantum physics the maximal information, represented by the state vector, is never complete in the sense that all possible future measurement results are precisely defined. Yet, we do not hesitate to emphasize that it certainly is complete in the sense that it is not possible to have more information about a system than what can be specified in its quantum state. In fact, the state vector represents that information which is necessary to arrive at the maximum possible set of probabilistic predictions for all possible future observations of the system. In our explicit example the state vector of the polarization of a photon can be expressed as $|\psi\rangle = a|\updownarrow\rangle + b|\leftrightarrow\rangle$ ($a$ and $b$ are complex numbers) in the basis of vertical $|\updownarrow\rangle$ and horizontal $|\leftrightarrow\rangle$ polarization. The probability to observe vertical polarization is $|a|^2$ and the probability to observe horizontal polarization is $|b|^2$. Measurement of vertical/horizontal polarization will change the state to an eigenstate associated with the result of the measurement. In our example, if the measurement by filter $F_{\updownarrow}$ results in vertical polarization, then the state changes
to $|\updownarrow\rangle$, and when the polarization is measured again with respect to the same basis by $F_{\leftrightarrow}$, it will return vertical polarization with probability one. Thus, no photon will have the property of horizontal polarization, as indicated in Fig. 1.7a, implying $H(B) = 0$. In Fig. 1.7b, a photon passing through $F_{\updownarrow}$ with the state $|\updownarrow\rangle$ will pass filter $F_{\nearrow}$ with a probability of 1/2, and so 50% of the photons will pass through $F_{\nearrow}$. A photon passing through $F_{\nearrow}$ changes its state from $|\updownarrow\rangle$ to $|\nearrow\rangle = \frac{1}{\sqrt{2}}(|\updownarrow\rangle + |\leftrightarrow\rangle)$, indicating a gain of new knowledge (about polarization at 45°) at the expense of an unavoidable and irrecoverable loss of the prior knowledge (about vertical/horizontal polarization). As before, this photon will pass $F_{\leftrightarrow}$ with a probability of 1/2. Thus, the probability for a photon to pass the sequence of filters $F_{\nearrow} F_{\leftrightarrow}$ is 1/4, implying $H(B|A) = 0.81$ bits (0.56 if the natural logarithm is used). Being a summary representation of the observer's in general probabilistic predictions for future observations, the quantum state normally changes in a measurement process into one of the new states defined by the measurement apparatus. After the measurement the old summary of the observer's information is at least partially lost, and a new one, established to be in accord with the change of the state, is indifferent to the knowledge collected from the previous measurements in the whole history of the system. Such a view was assumed by Pauli [1958] who writes$^6$: "Bei Unbestimmtheit einer Eigenschaft eines Systems bei einer bestimmten Anordnung (bei einem bestimmten Zustand des Systems) vernichtet jeder Versuch, die betreffende Eigenschaft zu messen, (mindestens teilweise) den Einfluß der früheren Kenntnisse vom System auf die (eventuell statistischen) Aussagen über spätere mögliche Messungsergebnisse."
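The filter sequences of Fig. 1.7 and the first measurements of Fig. 1.8 can be reproduced with a short calculation. The Python sketch below assumes ideal linear polarization filters, each transmitting a photon with probability $\cos^2$ of the angle between the photon's polarization and the filter axis and leaving a transmitted photon polarized along that axis. The function names and the angle convention (vertical at 90°, horizontal at 0°) are choices made for this illustration only. The last lines evaluate the Shannon information of the distribution (1/4, 3/4) both in bits and with natural logarithms.

```python
import math

def shannon_information(probs, base=2.0):
    """Shannon information -sum p log p; base 2 gives bits, base e natural units."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

def pass_probability(initial_angle_deg, filter_angles_deg):
    """Probability that a linearly polarized photon passes all filters in order."""
    probability = 1.0
    angle = initial_angle_deg
    for filter_angle in filter_angles_deg:
        probability *= math.cos(math.radians(filter_angle - angle)) ** 2
        angle = filter_angle  # a transmitted photon is polarized along the filter
    return probability

VERTICAL, DIAGONAL, HORIZONTAL = 90.0, 45.0, 0.0

# Photons that have already passed the vertical filter (Fig. 1.7).
p_fig_a = pass_probability(VERTICAL, [HORIZONTAL])            # no photon passes
p_fig_b = pass_probability(VERTICAL, [DIAGONAL, HORIZONTAL])  # one quarter passes

info_a = shannon_information([p_fig_a, 1 - p_fig_a])
info_b_bits = shannon_information([p_fig_b, 1 - p_fig_b])
info_b_nats = shannon_information([p_fig_b, 1 - p_fig_b], base=math.e)

def first_measurement_information(theta, order):
    """Fig. 1.8: information gained in the first measurement on the state |z+>."""
    if order == "theta_first":   # spin along the tilted direction measured first
        return shannon_information([math.cos(theta / 2) ** 2,
                                    math.sin(theta / 2) ** 2])
    return shannon_information([0.5, 0.5])  # spin along x measured first

print(p_fig_a, p_fig_b)
print(info_a, info_b_bits, info_b_nats)
print(first_measurement_information(0.1, "theta_first"),
      first_measurement_information(0.1, "x_first"))
```

Inserting the 45° filter raises the transmitted fraction from zero to one quarter, which no sieve model can produce, and for a small tilt angle the information gained in the first measurement of Fig. 1.8 depends strongly on which measurement comes first.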
1.2.3 A Physical Approach
A specific measure of information becomes a meaningful concept in physics only when it can be characterized by properties which naturally follow from the physics considered. Such a property can be, for example, invariance of the total information content of the system under variation of modes of observation, or conservation of the total information in time if there is no information exchange with an environment. We will show that for a quantum system the total information defined according to Shannon's measure does not have these properties. The classical world appears to be composed of particles and fields, and the nature of each one of these constituents could be specified quite independently
6 Translated: "In the case of indefiniteness of a property of a system for a certain experimental arrangement (for a certain state of the system) any attempt to measure that property destroys (at least partially) the influence of earlier knowledge of the system on (possibly statistical) statements about later possible measurement results."
of the particular phenomenon discussed or of the experimental procedure a physicist chooses. In other words, any concept introduced in classical physics is totally noncontextual. In particular, the total information content of a classical pointlike system (with no rotational and internal degrees of freedom), defined as Shannon's information associated with the probability distribution over the phase space, is independent of the specific set of variables (such as position and momentum, or angle and angular momentum, etc.) considered and is conserved in time if there is no information exchange with an environment$^7$. Operationally the total information content of a classical system can be obtained in the joint measurement of position and momentum, or in successive measurements in which the observation of position is followed by the observation of momentum or vice versa$^8$. In quantum physics any concept is limited to the description of phenomena taking place within some well-defined experimental context, that is, it is always restricted to a specific experimental procedure the physicist chooses. This raises the question: How can we define the total information content of a quantum system if, in order to be in reasonable agreement with common sense, it has to be invariant under variation of modes of observation and conserved in time if there is no information exchange with an environment? For a given density matrix $\rho$ the von Neumann entropy
\[ S(\rho) = -\mathrm{Tr}(\rho \log \rho) \tag{1.12} \]
is widely accepted as a suitable definition for the information content of a quantum system. For a system described in an $N$-dimensional Hilbert space it ranges from $\log N$ for a completely mixed state down to 0 for a pure state. Also, the von Neumann entropy is invariant under unitary transformations $\rho \to U \rho U^{+}$. That is, it is invariant under the change of the representation (basis) of $\rho$ and also conserved in time if there is no information exchange with an environment. However, we observe that any function$^9$ of the form $\mathrm{Tr}(f(\rho))$ can possess these
7 We discuss this in detail in Appendix A.1. Here, we note that given the probability distribution $\rho(\mathbf{r}, \mathbf{p}, t)$ over the phase space the total lack of information of a classical system is defined by [Jaynes, 1962]
\[ H_{total}(t) = -\int d^3r\, d^3p\, \rho(\mathbf{r}, \mathbf{p}, t) \log \frac{\rho(\mathbf{r}, \mathbf{p}, t)}{\mu(\mathbf{r}, \mathbf{p})}, \tag{1.11} \]
where a background measure $\mu(\mathbf{r}, \mathbf{p})$ is an additional ingredient that has to be added to the formalism to ensure invariance under variable change when we consider continuous probability distributions. The conservation of $H_{total}$ in time for a system with no information exchange with an environment is implied by the Hamiltonian evolution of a point in the phase space.
8 In full analogy with (1.10) we may write $H_{total}(\mathbf{r}, \mathbf{p}) = H(\mathbf{r}) + H(\mathbf{p}|\mathbf{r}) = H(\mathbf{p}) + H(\mathbf{r}|\mathbf{p})$.
9 The operator $f(\rho)$ is identified by having the same eigenstates as $\rho$ and the eigenvalues $f(w_j)$, equal to the function values taken at the eigenvalues $w_j$ of $\rho$.
properties for a suitably defined function $f$ and can, therefore, serve as indices of the measure of the information content of a system. We also observe that the von Neumann entropy is a property of the quantum state as a whole, without explicit reference to information contained in individual measurements. The question arises: how can the information content of a quantum system be defined and obtained operationally? More precisely: what set of individual measurements should we perform, and how should we combine the individual measures of information gained in different individual measurements to arrive at the total information content of a quantum system? We observe that, unlike in the classical case, the information carried by a quantum system cannot be obtained through a set of successive measurements in a consistent way, because the information gained in successive quantum measurements depends on the order of its acquisition (see Fig. 1.8 and the discussion above). This suggests that any attempt to obtain the total information content of a quantum system has to be related to a specific set of different possible experiments performed on an ensemble of equally prepared systems. For a quantum system in the state $\hat{\rho}$, different experiments correspond to different probabilities for the possible outcomes and therefore to different Shannon information. How are the individual measures of information obtained in different individual experiments related to the total information carried by a quantum system? It can be shown that the optimal experiment, which minimizes Shannon's information, is the one which corresponds to the orthonormal basis $|i\rangle$ formed by the eigenvectors of the density matrix $\hat{\rho}$: $\hat{\rho}|i\rangle = w_i|i\rangle$. The corresponding Shannon information is then equal to the von Neumann entropy, i.e.
$$H = -\sum_i w_i \log w_i = -\mathrm{Tr}(\hat{\rho} \log \hat{\rho}). \tag{1.13}$$
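The relation between the optimal experiment and the von Neumann entropy can be checked numerically. The following is a minimal sketch, assuming NumPy, logarithms to base 2, and an arbitrary illustrative qubit density matrix `rho` that is not taken from the text; the helper names are hypothetical.

```python
import numpy as np

def shannon(probs):
    """Shannon information in bits; terms with p = 0 contribute nothing."""
    p = np.asarray(probs, dtype=float)
    p = p[p > 1e-12]
    return float(-np.sum(p * np.log2(p)))

def von_neumann(rho):
    """-Tr(rho log2 rho), evaluated from the eigenvalues w_i of rho."""
    return shannon(np.linalg.eigvalsh(rho))

def measurement_entropy(rho, basis):
    """Shannon information of the outcome probabilities <b|rho|b> for the
    orthonormal basis whose vectors are the columns of `basis`."""
    probs = np.real(np.einsum("ij,jk,ki->i", basis.conj().T, rho, basis))
    return shannon(probs)

# Illustrative qubit state (an assumption, chosen only for this sketch).
rho = np.array([[0.8, 0.3], [0.3, 0.2]], dtype=complex)

_, eigvecs = np.linalg.eigh(rho)
h_optimal = measurement_entropy(rho, eigvecs)    # experiment in the eigenbasis
h_other = measurement_entropy(rho, np.eye(2))    # experiment in another basis

# A unitary change of representation leaves the eigenvalues w_i unchanged.
angle = 0.7
u = np.array([[np.cos(angle), -np.sin(angle)],
              [np.sin(angle), np.cos(angle)]], dtype=complex)
rho_rotated = u @ rho @ u.conj().T

print(h_optimal, von_neumann(rho), h_other, von_neumann(rho_rotated))
```

In this sketch the experiment in the eigenbasis reproduces the von Neumann entropy, while the experiment in the other basis yields a larger Shannon information for the same state.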
Clearly the expression (1.13) is invariant under unitary transformations. Again this implies invariance of $H$ under a change of the representation basis of $\hat{\rho}$ and also its conservation in time if there is no information exchange with an environment. That is, if we perform the optimal experiments both at time $t_0$ and at some future time $t$, the Shannon information measures associated with the optimal experiments at the two times
$$H(t) = -\sum_i w_i(t) \log w_i(t) = H(t_0) \tag{1.14}$$
will be the same. Here, the eigenvalues of the density matrix at time $t$ are $w_i(t)$. However, without the additional knowledge of the eigenbasis of the density
matrix we cannot find the optimal experiment and determine experimentally the associated Shannon information. Also, all the statistical predictions that can be made for the optimal measurement are the same as if we had an ordinary (classical) mixture, with fractions $w_i$ of the systems giving with certainty the results that are associated with the eigenvectors $|i\rangle$. In this sense the optimal measurement is a classical type of measurement and therefore in this particular case, and only then, Shannon's measure defines the information gain in a measurement appropriately. It is thus not surprising that Shannon's measure is useful only when applied to measurements which can be understood as classical measurements. Again the question arises: how should we combine individual measures of information obtained in different individual measurements in order to arrive at the information content of a quantum system if the individual measurements are incompatible with the density operator (non-optimal measurements)? One may be tempted to define the total information content of a quantum system in a constructive fashion, namely as a sum of individual measures of information over a complete set of mutually complementary experiments. These are experiments with the property that complete knowledge of the outcome in one of the experiments excludes any knowledge of the outcomes in the others. For example, a set of measurements of (1) vertical/horizontal polarization, (2) polarization at $+45^\circ$/$-45^\circ$, and (3) left/right circular polarization is a complete set of mutually complementary measurements for a photon's polarization. Consider a photon's polarization state $|\psi\rangle = \cos\theta\,|V\rangle + \sin\theta\,|H\rangle$, where $|V\rangle$ and $|H\rangle$ denote vertical and horizontal polarization. We sum the individual measures of Shannon's information for the mutually complementary observations (1), (2) and (3) and obtain
$$H_{total} = H_1 + H_2 + H_3 = -\cos^2\theta \log\cos^2\theta - \sin^2\theta \log\sin^2\theta - \frac{1 - \sin 2\theta}{2} \log\frac{1 - \sin 2\theta}{2} - \frac{1 + \sin 2\theta}{2} \log\frac{1 + \sin 2\theta}{2} + 1 \tag{1.15}$$
for the total Shannon information carried by the photon's polarization. Our result clearly depends on the parameter $\theta$ and thus is not invariant under unitary transformations. This gives our candidate $H_{total}$ for the total information carried by the photon's polarization certain features that strongly disagree with our intuition. Firstly, $H_{total}$ is not equal for each polarization state of the same purity. Secondly, $H_{total}$ is not specified by the polarization state alone but depends on the particular set of mutually complementary observations. If we choose another set of mutually complementary observations, e.g., (1) polarization along the direction at an angle $\varphi$ with respect to the vertical direction, (2) polarization along the direction at an angle ($\varphi + 45^\circ$) with respect to the vertical direction, and (3) left/right circular polarization, the total information carried by the photon's polarization might not be the same (it depends on the particular value of the angle $\varphi$). And thirdly, $H_{total}$ is not conserved in time for a system completely isolated from its environment. In this section we have stressed some conceptual difficulties arising when we apply Shannon's notion of information to define the information gain in a quantum measurement. Investigating three different approaches to the concept of Shannon's information, we argued that these difficulties arise whenever it is not possible, not even in principle, to assume that the attributes observed are assigned to the system before the observation is performed. The question arises: are there other physical situations where the use of Shannon's measure of information might be justified in quantum mechanics? Obviously, there are. Suppose that there is a set of different possible preparations of the initial state and that the a priori probabilities for the different preparations are known to the observer. The observer is not told which one of the states is actually implemented. Suppose now that the observer wants to determine the actual state. Here the observer's ignorance about the possible prepared states can be quantified by Shannon's measure of information because the possible states can, in principle, be thought of as being objectively present before the measurement is performed. We briefly review an explicit example analyzed by Peres [1995]. Let $\mathbf{n}_1$, $\mathbf{n}_2$ and $\mathbf{n}_3$ denote three unit vectors defined in a plane and separated by angles of $120^\circ$. Consider a spin-1/2 particle and define normalized states $|\psi_i\rangle$ by $\mathbf{n}_i \cdot \boldsymbol{\sigma}\,|\psi_i\rangle = |\psi_i\rangle$ ($i = 1, 2, 3$). The spin-1/2 particle can be prepared in one of the three states $|\psi_i\rangle$ defined above, and these three preparations have equal a priori probability, i.e. $H = \log 3$. Which one of these states is actually prepared? Since the states are not orthogonal, the answer cannot be unambiguous.
The procedure giving the maximal possible information (that is, reducing $H$ as much as possible) is a POVM (positive-operator-valued measure) that rules out one of the three allowed states and leaves equal a posteriori probabilities for the two others. The value of $H$ is reduced to $\log 2 = 1$, so that the actual information gain is $\log(3/2)$.
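The dependence of the candidate $H_{total}$ of Eq. (1.15) on the state parameter, discussed earlier in this section, can be illustrated numerically. The sketch below assumes logarithms to base 2 and the linear polarization state $\cos\theta\,|V\rangle + \sin\theta\,|H\rangle$; the function names are hypothetical and serve only this illustration.

```python
import math

def h2(p):
    """Binary Shannon information in bits for the probabilities (p, 1 - p)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def h_total(theta):
    """Sum of the Shannon informations of the three mutually complementary
    polarization measurements for the state cos(theta)|V> + sin(theta)|H>."""
    p_vh = math.cos(theta) ** 2                # (1) vertical/horizontal
    p_diag = (1 + math.sin(2 * theta)) / 2     # (2) +45/-45 degrees
    p_circ = 0.5                               # (3) left/right circular
    return h2(p_vh) + h2(p_diag) + h2(p_circ)

for degrees in (0.0, 22.5, 45.0):
    print(degrees, round(h_total(math.radians(degrees)), 4))
```

All three states are pure, yet the sum equals 2 bits at $\theta = 0$ and $\theta = 45^\circ$ and exceeds 2 bits at $\theta = 22.5^\circ$, which is the lack of invariance described above.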
1.3
Quantum mechanics is an intrinsically probabilistic description of Nature. All an experimentalist can know before a quantum experiment is performed are the probabilities for all possible outcomes to occur. In general, which specific outcome occurs is objectively random. We define a new measure of information for an individual measurement which is based on the fact that the probabilistic predictions an experimentalist can make have no empirical significance for any individual experiment but only as predictions about the number of occurrences of a specific outcome in future repetitions of the experiment. Consider a stationary experimental arrangement with two detectors, where only one detector fires at a time, i.e. in each experimental trial. Detector 1, say, fires (we call this the "yes" outcome) with probability $p$. If it does not fire (the "no" outcome), the other detector will fire; this happens with probability $q = 1 - p$. When exactly one detector has fired, the experiment is over. Examples would be the Stern-Gerlach experiment with a spin-1/2 particle or an interference experiment with an interferometer of the Mach-Zehnder type. Knowing the probabilities for the two outcomes to occur, all an experimenter can predict is how many times a specific detector fires. In making her prediction she has only a limited number of systems to work with. Then, because of the statistical fluctuations associated with any finite number of experimental trials, the number of occurrences of a specific outcome in $N$ future repetitions of the experiment is not precisely predictable. In $N$ independent experimental trials, the particular ordered sequence of results "yes, no, no, ..., yes" containing "yes" exactly $n$ times and "no" exactly $N - n$ times occurs with probability
$$p \cdot (1 - p) \cdot (1 - p) \cdots p = p^n (1 - p)^{N - n}. \tag{1.16}$$
The various different permutations of the sequence are mutually exclusive events, and so we can add their probabilities to obtain$^{10}$
$$P_N(n) = \binom{N}{n} p^n (1 - p)^{N - n}, \tag{1.17}$$
10 We are ignorant about the different possible orders of individual outcomes because, in a quantum measurement, the particular order of individual outcomes is not defined before the experiment is performed. In contrast, classical measurements reveal pre-existing properties of individual systems and therefore the particular sequence of individual outcomes is of importance. Information that is gained about a particular observed sequence is adequately defined by Shannon's measure of information (see Sec. 1.2).
the probability that in $N$ independent experimental trials we observe "yes" $n$ times and "no" $N - n$ times. This is known as the binomial distribution [Gnedenko, 1976]. Note that if one bets on a specific result, e.g. that the number of "yes" outcomes will be the one with the highest probability, which is $n_{max} \approx pN$, the probability of success still depends on $p$. With an underlying probability of $p = 0.5$, the probability of 5 "yes" outcomes in 10 trials is only 0.25, but with $p = 0.9$ the probability of 9 "yes" outcomes in 10 trials is 0.39. It is a peculiar feature of the binomial distribution that the future number of occurrences is less specified when $p$ is around 0.5. An experimenter's uncertainty$^{11}$, or lack of information, in the value $n$ is given by the mean-square deviation, defined as the expectation of the square of the deviation of $n$ from the mean value $pN$ [Gnedenko, 1976]
$$\sigma^2 := \sum_{n=0}^{N} P_N(n)\,(n - pN)^2 = p(1 - p)N. \tag{1.19}$$
In fact, if $\sigma$ is small, then each term in the sum in Eq. (1.19) is small. A value $n$ for which $|n - pN|$ is large must therefore have a small probability $P_N(n)$. In other words, in the case of small $\sigma$, large deviations of the number of occurrences of the "yes" outcome from the mean $pN$ are improbable. In this case an experimenter knows the future number of occurrences with high certainty. Conversely, a large variance indicates that not all highly probable values of $n$ lie near the mean $pN$. In that case the experimenter knows much less about the future number of occurrences. For a sufficiently large number $N$ of experimental trials, the confidence interval within which the number of occurrences of the "yes" outcome can be found in 68% of cases is given as [Gnedenko, 1976]
$$(pN - \sigma,\ pN + \sigma). \tag{1.20}$$
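The numbers quoted for the binomial distribution, the closed form $p(1-p)N$ of the mean-square deviation and the 68% interval can be reproduced with a few lines of code. This is a minimal sketch using only the Python standard library; the function names are hypothetical, and $N = 1000$ is an arbitrary choice for "sufficiently large".

```python
import math

def binomial_pmf(n, trials, p):
    """P_N(n): probability of exactly n 'yes' outcomes in `trials` trials."""
    return math.comb(trials, n) * p**n * (1 - p) ** (trials - n)

def variance(trials, p):
    """Mean-square deviation of n from its mean p*N, summed term by term."""
    mean = p * trials
    return sum(binomial_pmf(n, trials, p) * (n - mean) ** 2
               for n in range(trials + 1))

def prob_within_one_sigma(trials, p):
    """Probability that n lies in the interval (pN - sigma, pN + sigma)."""
    mean = p * trials
    sigma = math.sqrt(p * (1 - p) * trials)
    return sum(binomial_pmf(n, trials, p)
               for n in range(trials + 1) if abs(n - mean) <= sigma)

print(round(binomial_pmf(5, 10, 0.5), 2))    # 0.25
print(round(binomial_pmf(9, 10, 0.9), 2))    # 0.39
print(variance(10, 0.5), 0.5 * 0.5 * 10)     # term-by-term sum vs. p(1-p)N
print(prob_within_one_sigma(1000, 0.5))      # close to the quoted 68%
```

The variance is largest at $p = 0.5$, in line with the remark that the future number of occurrences is least specified there.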
Therefore, if an observer just plans to perform the experiment N times, he knows in advance, before the experiments are performed and their outcomes
11 Since the binomial distribution has a finite deviation, it fulfills Chebyshev's inequality [Gnedenko, 1976]:
$$\mathrm{Prob}\{|n - pN| > k\sigma\} \le \frac{1}{k^2}. \tag{1.18}$$
This inequality means that the probability that $n$ will deviate from the mean $pN$ by more than $k$ standard deviations is less than or equal to $1/k^2$. The bound is informative only for $k > 1$. This inequality is the strongest one possible for the class of probability distributions having a finite deviation, although more stringent ones can be given for the present case of the binomial distribution.
Figure 1.9: The probability to observe $n$ occurrences of the "yes" outcome in $N$ future repetitions of the experiment as a function of $n$. For a sufficiently large number $N$, the confidence interval within which the number of occurrences of the "yes" outcome can be found in 68% of the cases is given by the root-mean-square deviation $\sigma$.
become known, that the number of future occurrences of the "yes" outcome will be found with probability 68% within the confidence interval (1.20) (see Fig. 1.9). Consider now an experiment with three possible outcomes $a_1$, $a_2$ and $a_3$ whose probabilities of occurrence are $p_1$, $p_2$ and $p_3$. There is a method by which an observer can decompose this 3-fold alternative into binary alternatives and then apply the measure of lack of information (1.19). He could, for example, consider the two outcomes $a_1$ and $a_2$ as one single outcome $a_1 \vee a_2$ that occurs with probability $p_1 + p_2$, and the outcome $a_3$ as an exclusive outcome that occurs with probability $p_3 = 1 - p_1 - p_2$. The observer may now first ask for the number of occurrences of the outcome $a_1 \vee a_2$ ($a_3$) in $N$ future experimental trials, and then, whenever the outcome $a_1 \vee a_2$ occurs, further ask for the number of occurrences of the outcome $a_1$ ($a_2$). In the first phase of this method the measure of the experimentalist's lack of information about the number of future occurrences of the outcome $a_1 \vee a_2$ in $N$ experimental trials is given by $\sigma^2(a_1 \vee a_2, a_3) = (p_1 + p_2)\,p_3 N$. For the cases in which the second phase of the method must be carried out, a further lack of information can be expected. Namely, when the outcome $a_1 \vee a_2$ occurs, the measure of the experimentalist's lack of information about the number of occurrences of $a_1$ ($a_2$), given that the outcome $a_1 \vee a_2$ did occur, is
$$\sigma^2(a_1, a_2 | a_1 \vee a_2) = \frac{p_1}{p_1 + p_2}\,\frac{p_2}{p_1 + p_2}\,N (p_1 + p_2).$$
Note that the conditional probabilities for $a_1$ and $a_2$, given that $a_1 \vee a_2$ did occur, are $\frac{p_1}{p_1 + p_2}$ and $\frac{p_2}{p_1 + p_2}$, respectively. Also note that the number of future experimental trials in which the second phase of the method has to be carried out is $(p_1 + p_2)N$. The second phase of the method is conditional and is only expected to occur a fraction $p_1 + p_2$ of the time. Thus, in total, the expected measure of the experimentalist's lack of information $U_N(a_1, a_2, a_3)$ with respect to the number of occurrences of the three outcomes in $N$ future repetitions of the experiment is
$$U_N(a_1, a_2, a_3) = \sigma^2(a_1 \vee a_2, a_3) + (p_1 + p_2)\,\sigma^2(a_1, a_2 | a_1 \vee a_2) = (p_1 p_2 + p_1 p_3 + p_2 p_3)N. \tag{1.21}$$
This can easily be generalized for $n$ possible outcomes $a_1, ..., a_n$ whose probabilities to occur are $p = (p_1, ..., p_n)$ to
$$U_N(a_1, ..., a_n) = \sum_{i<j}^{n} p_i p_j N. \tag{1.22}$$
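The two-phase decomposition and the pairwise sum (1.22) can be compared directly. The sketch below uses only the Python standard library; the function names and the example distribution $(0.5, 0.3, 0.2)$ with $N = 100$ are assumptions made for illustration.

```python
from itertools import combinations

def sigma_sq(p, trials):
    """Binary lack of information p(1 - p)N, cf. Eq. (1.19)."""
    return p * (1 - p) * trials

def lack_two_stage(p1, p2, trials):
    """U_N for three outcomes via the two-phase method, with p3 = 1 - p1 - p2."""
    first = sigma_sq(p1 + p2, trials)               # a1-or-a2 versus a3
    # Second phase: a1 versus a2 within the expected (p1 + p2)N trials.
    second = sigma_sq(p1 / (p1 + p2), (p1 + p2) * trials)
    return first + (p1 + p2) * second

def lack_pairwise(probs, trials):
    """U_N as the sum over all pairs i < j of p_i p_j N, Eq. (1.22)."""
    return trials * sum(pi * pj for pi, pj in combinations(probs, 2))

def half_one_minus_squares(probs):
    """(1 - sum_i p_i^2) / 2, which equals the pair sum when sum_i p_i = 1."""
    return 0.5 * (1 - sum(p * p for p in probs))

probs = (0.5, 0.3, 0.2)
print(lack_two_stage(0.5, 0.3, 100), lack_pairwise(probs, 100))
print(lack_pairwise(probs, 1), half_one_minus_squares(probs))
```

Both routes give the same value, and the result does not depend on which two outcomes are grouped first, since the pairwise sum is symmetric in the $p_i$.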
Notice that the experimenter's lack of information (1.22) is proportional to the number of experimental trials. This property guarantees that each individual performance of the experiment contributes the same amount of information gain, no matter how many times the experiment has already been performed. After each experimental trial, the experimenter's lack of information therefore decreases by the same amount
$$U(p) := \frac{U_N(a_1, ..., a_n)}{N} = \sum_{i<j}^{n} p_i p_j = \frac{1}{2}\left(1 - \sum_{i=1}^{n} p_i^2\right) \tag{1.23}$$
of information. This is the lack of information with respect to a single future experimental trial. The definition of $U(p)$ suggests that the knowledge, or information, with respect to a single future experimental trial that an experimentalist possesses before the experiment is performed is somehow the complement of $U(p)$ and, furthermore, that it is a function of the sum of the squares of the probabilities. A first ansatz might be
$$I(p) = \sum_{i=1}^{n} p_i^2. \tag{1.24}$$
Expressions of such a general type were studied in detail by Hardy et al. [1952] and Uffink [1990], who list several properties to substantiate that $I(p)$ is a reasonable measure of information. For example,
1. $I(p)$ is invariant under a re-labeling of the set of possible outcomes.
2. $I(p)$ is continuous in $p_j$ and attains its maximal value of unity only when all the $p_j$ but one are zero, this one having the value unity. The minimal value of $1/n$ is reached only when the probability is uniformly distributed over all possible outcomes, i.e. when $p_j = 1/n$ for all $j$.
3. If two ensembles specified by probability distributions $p = (p_1, ..., p_n)$ and $q = (q_1, ..., q_n)$ are joined together (in mathematical language this is described by a convex combination $r_j = a p_j + (1 - a) q_j$, $0 \le a \le 1$), one loses the information from which ensemble a particular sample comes, and therefore the measure of information decreases:
$$I(r) \le a I(p) + (1 - a) I(q). \tag{1.25}$$
4. When $p = (p_1, ..., p_n)$ and $q = (q_1, ..., q_m)$ are two probability distributions and $r = (r_{11}, ..., r_{nm})$ is their independent product, i.e. $r_{ij} = p_i q_j$, one has
$$I(r) = I(p) I(q). \tag{1.26}$$
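The four properties listed above can be spot-checked numerically for the unnormalized measure $I(p) = \sum_i p_i^2$. The following sketch assumes NumPy; the randomly drawn distributions, the mixing weight `a` and all variable names are illustrative assumptions, and a numerical check of this kind is of course no substitute for the proofs in the cited references.

```python
import numpy as np

def info(p):
    """Unnormalized measure I(p) = sum_i p_i^2, Eq. (1.24)."""
    p = np.asarray(p, dtype=float)
    return float(np.sum(p ** 2))

rng = np.random.default_rng(seed=0)
p = rng.dirichlet(np.ones(4))    # random distribution over 4 outcomes
q = rng.dirichlet(np.ones(4))    # second distribution, same outcomes
s = rng.dirichlet(np.ones(3))    # independent distribution over 3 outcomes
a = 0.3                          # mixing weight of the convex combination

relabelled = rng.permutation(p)          # property 1
uniform = np.full(4, 0.25)               # property 2, minimum 1/n
certain = np.array([1.0, 0.0, 0.0, 0.0]) # property 2, maximum 1
mixed = a * p + (1 - a) * q              # property 3
product = np.outer(p, s).ravel()         # property 4, r_ij = p_i s_j

print(info(relabelled), info(p))
print(info(uniform), info(certain))
print(info(mixed), a * info(p) + (1 - a) * info(q))
print(info(product), info(p) * info(s))
```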
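More generally, one may consider the whole class of measures
$$M_\alpha(p) = \left(\sum_{i=1}^{n} p_i^\alpha\right)^{\frac{1}{\alpha - 1}} \quad \text{for } \alpha \neq 1, \tag{1.27}$$
that from the point of view of information theory all can be assumed to quantify information properly. Our newly introduced measure of information $I(p)$ may then be seen as the specific case $\alpha = 2$. These expressions are also related to Rényi's [1961] entropy
$$H_\alpha(p) = -\log M_\alpha(p) = \frac{1}{1 - \alpha} \log \sum_{i=1}^{n} p_i^\alpha. \tag{1.28}$$
Shannon's measure of information, obtained in the limit
$$H(p) = \lim_{\alpha \to 1} H_\alpha(p) = -\sum_{i=1}^{n} p_i \log p_i, \tag{1.29}$$
and the minus logarithm of our newly proposed measure of information
$$-\log I(p) = H_2(p) \tag{1.30}$$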
may then be seen as special cases of Rényi's entropy. Also, note that if, instead of $I(p)$, the expression $-\log I(p)$ is alternatively seen as an adequate measure of uncertainty in an individual measurement, then the uncertainty about a joint outcome associated with the product distribution $r_{ij} = p_i q_j$ (see property 4 above) is the sum of the uncertainties about the mutually independent outcomes: $-\log I(r) = -\log I(p) - \log I(q)$. Notice that the expression (1.24) can also be viewed as the squared length of the probability vector $p$. Obviously, because of $\sum_i p_i = 1$, not all vectors in probability space are possible. Indeed, the minimum length of $p$ is attained when all $p_i$ are equal ($p_i = 1/n$). This corresponds to the situation of complete lack of information in an experiment. Therefore we suggest normalizing the measure of information in an individual quantum measurement, obtaining finally [Brukner and Zeilinger, 1999(a)]
$$I(p) = \mathcal{N} \sum_{i=1}^{n} \left(p_i - \frac{1}{n}\right)^2. \tag{1.31}$$
Considering from now on those cases where maximally $k$ bits of information can be encoded, i.e. $n = 2^k$, the normalization is $\mathcal{N} = 2^k k/(2^k - 1)$, i.e.
$$I(p) = \frac{2^k k}{2^k - 1} \sum_{i=1}^{2^k} \left(p_i - \frac{1}{2^k}\right)^2. \tag{1.32}$$
Then $I(p)$ results in $k$ bits of information if one $p_i = 1$, and it results in 0 bits of information when all $p_i$ are equal. Such a normalized measure of information is directly related to the squared Euclidean distance between the actual probability distribution vector $(p_1, ..., p_n)$ and the uniform distribution vector $(1/n, ..., 1/n)$ in probability space. This is justified since the uniform distribution gives no information whatever for preferring one outcome over another. Specifically, we will consider the cases $n = 2$ and $n = 4$, when maximally 1 bit and 2 bits of information are encoded, respectively. Then,
$$I(p) = 2\sum_{i=1}^{2} p_i^2 - 1 = (p_1 - p_2)^2 \quad \text{for } n = 2, \tag{1.33}$$
$$I(p) = \frac{8}{3}\sum_{i=1}^{4} p_i^2 - \frac{2}{3} \quad \text{for } n = 4. \tag{1.34}$$
Quantum mechanics is an intrinsically probabilistic description of nature. All that is known before an experiment is performed are the probabilities for all possible events to occur. Given these probabilities, a probabilistic prediction of the future number of occurrences of each outcome is all an experimenter can make about the open future which, in quantum mechanics, is fundamentally not precisely predictable$^{12}$. This we interpret as strong evidence for the adequacy of our measure of information in defining the information gain in quantum measurements. Most importantly, as we will show in the next section, the total information in a quantum system, obtained by summing the individual measures of information over all possible complementary experiments, turns out to be invariant under any change of the set of complementary observations, and it is conserved in time if there is no information exchange with the environment.
12 Here, a very subtle position was assumed by Weizsäcker [1975], who writes: "It is most important to see that this [the fact that probability is not a prediction of the precise value of the relative frequency] is not a particular weakness of the objective empirical use of the concept of probability, but a feature of the objective empirical use of any quantitative concept. If you predict that some physical quantity, say a temperature, will have a certain value when measured, this prediction also means its expectation value within a statistical ensemble of measurements. The same statement applies to the empirical quantity called relative frequency. But here are two differences which are connected to each other. The first difference: In other empirical quantities the dispersion of the distribution is in most cases an independent empirical property of the distribution and can be altered by more precise measurements of other devices; in probability the dispersion is derived from the theory itself and depends on the absolute number of cases. The second difference: In other empirical quantities the discussion of their statistical distributions is done by another theory than the one to which they individually belong, namely by the general theory of probability; in probability this discussion evidently belongs to the theory of this quantity, namely of probability itself. The second difference explains the first one."
Chapter 2
$$I_{total} = \sum_{j=1} I_j(p) \tag{2.1}$$
to be the total information content of a quantum system. For example, for a spin-1/2 particle, measurements of three orthogonal components of spin form a complete set of mutually complementary observations. These observations then completely exclude each other. One quantity, for example the z-component of spin, might be well defined at the expense of maximal uncertainty about the other orthogonal components. The experimentalist may decide to measure a different set of complementary variables, thus gaining certainty about one or more variables at the expense of losing certainty about the other(s). In the case of spin this could be the projections along rotated directions, for example, where the uncertainty in one component is reduced but the one in another component is increased correspondingly. Intuitively one expects that the total uncertainty or, equivalently, the total information carried by the system is invariant under such a transformation from one complete set of complementary variables to another. In Sec. 2.1 we show that the total information defined according to our new measure has exactly that invariance property. Also, it is conserved in time if there is no information exchange with an environment. In Sec. 2.3 we find that the total information of a system results in $k$ bits of information for a system consisting of $k$ qubits. Entanglement results from the fact that for composite systems, information can also be distributed in joint properties of a multiparticle system (Sec. 2.2). In particular, maximal entanglement arises when the total information of a composite system is exhausted in specifying joint properties, with no individual qubit carrying any information on its own.
2.1
A Qubit Carries One Bit
A qubit is a quantum bit, the analogue of the ordinary bit, e.g., a computer's binary digit 1 or 0, or heads or tails in tossing a coin. The qubit is described by a unit vector in a two-dimensional Hilbert space with a chosen computational basis $\{|0\rangle, |1\rangle\}$ (e.g. two polarization states of a photon, two spin states of a spin-1/2 particle, or two paths inside a Mach-Zehnder interferometer). The classical Boolean states, 0 and 1, can therefore be represented by a pair of orthogonal states $|0\rangle$ and $|1\rangle$. Unlike classical bits, however, qubits can also be in a superposition of $|0\rangle$ and $|1\rangle$ such as $a|0\rangle + b|1\rangle$, where $a$ and $b$ are complex numbers. Even though qubits can be put in a quantum superposition, thus having infinitely many more states, we will show that one qubit carries just one bit of information. The information content of the qubit will be obtained as a sum of individual information contents in a set of mutually complementary measurements. The amount of 1 bit of information for the total information content will be invariant under the choice of the particular set of mutually complementary measurements and conserved in time if there is no information exchange with an environment.
2.1.1
Complementary Propositions
Figure 2.1: A complete set of mutually exclusive measurements of a photon's polarization. They are associated with the propositions: "The polarization of the photon is vertical", "The polarization of the photon is $+45^\circ$" and "The polarization of the photon is left circular". A photon from the source passes through the vertically oriented polarizer and through a Nicol prism oriented at the angle $\theta$ with respect to the vertical direction, and then it hits one of the detector plates behind the Nicol prism. Alternatively, in Fig. 2.1c, a quarter-wave plate is inserted in front of the Nicol prism to observe circular polarization.

Every reasonably well-designed experiment tests some proposition. Knowledge of the state of a quantum system permits the prediction of individual outcomes with certainty only for that limited class of experiments which have definite outcomes, a situation where the corresponding propositions have definite truth values. To illustrate this point, let us consider the state $|\psi\rangle$ which is an eigenstate of the projection operator $\hat{P} = |\phi\rangle\langle\phi|$ with eigenvalue 1, that is, $\hat{P}|\psi\rangle = |\psi\rangle$. This simply means that the quantum system described by the state $|\psi\rangle$ will be found with certainty in the state $|\phi\rangle$ if it is measured with an appropriately designed apparatus. What about the truth values of other propositions? From theorems like that of Kochen and Specker [1967] we know that in quantum mechanics it is not possible, not even in principle, to assign definite non-contextual truth values to all conceivable propositions. Consider encoding information into the polarization of photons and suppose that a photon (by passing through a vertically oriented polarizing filter in Fig. 2.1) has vertical polarization. The information content of the photon can then be expressed as a truth value of the proposition: (1) "The polarization of the photon is vertical." This permits the prediction of individual outcomes with certainty only for the case when we actually perform a measurement in the vertical-horizontal polarization basis. A measurement in any other basis, with the Nicol prism rotated to the angle $\theta$ with respect to the vertical axis, must necessarily be probabilistic. Then each individual photon, when it encounters the Nicol prism, has exactly two probabilistic options: to pass straight through the prism or to be deflected in a specific way characteristic of the prism. Quantum theory predicts $p(\theta) = \cos^2\theta$ for the probability that the photon is found polarized along the direction of the Nicol prism oriented at the angle $\theta$ (the upper detector plate is hit) (Fig. 2.1). Specifically, for $\theta = 45^\circ$, we obtain $p = 1/2$, that is, the answer the experiment gives when we measure along that direction is completely random (Fig. 2.1b). In
this extreme case we have absolutely no knowledge about the truth value of the proposition (2): "The polarization of the photon is $+45^\circ$." If we introduce an additional quarter-wave plate in front of the Nicol prism we will observe circular polarization, and again the answer will be completely random (Fig. 2.1c). In that case we are completely uncertain about the truth value of the proposition (3): "The polarization of the photon is left circular." The three mutually complementary propositions for the photon's polarization have the property that complete knowledge of the truth value of any one of the propositions implies maximal uncertainty about the truth values of the others. In what follows, we will analyze mutually complementary propositions in an interference experiment with an idealized Mach-Zehnder type of interferometer (Fig. 2.2). We assume the two mirrors used for splitting and recombining the beams to be identical half-silvered mirrors. Also, we assume equal path lengths between the mirrors and no absorption in either beam inside the interferometer. Suppose that in the presence of a specific phase shift $\varphi$ between the two beams inside the interferometer in Fig. 2.2a, the particle will exit with certainty towards the upper (lower) detector behind the beam splitter. In this case we have complete knowledge of the beam the particle will be found in behind the beam splitter, at the expense of having absolutely no knowledge of which of the two paths the particle took inside the interferometer. The state of the particle is then represented by the truth value ("true" or "false") of the proposition: "The particle takes the outgoing path towards the upper detector in the presence of the phase shift $\varphi$." In contrast, if we insert detectors that are able to detect the particle without absorbing it, one into each of the two paths inside the interferometer (Fig. 2.2b), we may obtain information about which path the particle took inside the interferometer without absorbing the particle. The interference phenomenon then does not occur. In that case we know precisely the path of the particle inside the interferometer, at the expense of complete uncertainty about the outgoing beam the particle takes. Notice that both outgoing beams will be equally probable regardless of the phase shift between the two beams inside the interferometer. The state of the particle is now specified by the truth value of the proposition: "The particle takes the upper path inside the interferometer." Knowing that spin-1/2 affords a model of the quantum mechanics of all two-state systems, i.e. qubits, we expect that there are always three mutually complementary propositions whenever binary alternatives are considered. Indeed, it can easily be shown that even without path information our knowledge of the beam the particle will be found in behind the beam splitter in the arrangement in Fig. 2.2a will be completely removed if we introduce an additional
Figure 2.2: Principle sketch of mutually exclusive interference experiments with a Mach-Zehnder type of interferometer. They are associated with the following set of mutually complementary propositions: "The particle takes the outgoing path towards the upper detector in the presence of the phase shift $\varphi$", "The particle takes the upper path inside the interferometer", and "The particle takes the outgoing path towards the upper detector in the presence of the phase shift $\varphi + \pi/2$". Into each of the two paths of the interferometer in Fig. 2.2b one detector is inserted with the property that it detects the particle without absorbing it.
phase shift of $\pi/2$ between the two beams inside the interferometer. Then, in the new arrangement in Fig. 2.2c, both outgoing beams will be equally probable. Next, suppose that in the presence of a specific phase shift $\varphi + \pi/2$, the particle will exit with certainty towards the upper (lower) detector in the arrangement in Fig. 2.2c. The state of the system is now represented by the truth value of the proposition: "The particle takes the outgoing path towards the upper detector in the presence of the phase shift $\varphi + \pi/2$." For a particle in that state we have complete knowledge of the outgoing beam the particle will take, at the expense of having absolutely no knowledge either about the path inside the interferometer (Fig. 2.2b) or about the outgoing path in the arrangement in Fig. 2.2a. Notice that we can label various sets of the 3 mutually complementary propositions by the value of the phase shift: (1) "The particle takes the outgoing path towards the upper detector in the presence of the phase shift $\varphi$", (2) "The particle takes the upper path inside the interferometer", and (3) "The particle takes the outgoing path towards the upper detector in the presence of the phase shift $\varphi + \pi/2$". The 3 propositions we found for the interferometer are formally equivalent to the complementary propositions about spin-1/2: (1) "The spin is up along $\varphi$ in the x-y plane", (2) "The spin is up along the z-axis", and (3) "The spin is up along $\varphi + \pi/2$ in the x-y plane". Here, the direction $\varphi$ is
assumed to lie in the x-y plane, oriented at an angle $\varphi$ with respect to the x-axis. Evidently, this analogy can be carried even further using the concept of multiports. Therefore, from now on we will explicitly discuss spin measurements only, keeping in mind the applicability of these ideas to interference experiments.
2.1.2
Invariant Information in a Qubit
The situations discussed so far are just extreme cases of maximal knowledge of one proposition at the expense of complete ignorance about the complementary propositions. Wootters and Zurek [1979] found that measurements can be made to determine with high probability which slit each particle of the ensemble traversed, without completely destroying the interference pattern formed by the impingement of the particles on a screen. They found for the double-slit experiment, and Zeilinger [1986], Greenberger and Yasin [1988], Jaeger, Shimony and Vaidman [1995], Mittelstaedt et al. [1987], Lahti et al. [1991], and Englert [1996] found for the interferometer, that we can obtain some partial knowledge about the particle's path and still observe an interference pattern of reduced contrast as compared to the ideal interference situation. In related experiments, Summhammer et al. [1982] and Rauch and Summhammer [1984] measured the interference contrast in an experiment with a partial absorber in one of the beams inside the interferometer and obtained full agreement with the quantum mechanical description. In order to quantify this intermediate situation of partial knowledge of three mutually exclusive types of information we make use of the new measure of information (1.33) introduced in Sec. 1.3. For the purpose of calculating the probabilities for all possible outcomes of a complete set of mutually complementary quantities, it is most useful to use the basis of the two beams $|\mathrm{UP}\rangle$ and $|\mathrm{LP}\rangle$ taking the upper and lower path inside the interferometer, respectively. The evolution of the kets $|\mathrm{UP}\rangle$ and $|\mathrm{LP}\rangle$ is given by
$$|\mathrm{UP}\rangle \rightarrow e^{-i\varphi}|\mathrm{UP}\rangle \rightarrow \frac{1}{\sqrt{2}}\,e^{-i\varphi}\left(|D2\rangle + i|D1\rangle\right),$$
$$|\mathrm{LP}\rangle \rightarrow \frac{1}{\sqrt{2}}\left(|D1\rangle + i|D2\rangle\right), \tag{2.2}$$
respectively, where the ket |D1 denotes the particle directed towards detector D 1, etc. This is realized by a phase shift of in path 2 and a successive transformation at a symmetric half-silvered mirror beam splitter. Consider a system in a state represented in the basis of two paths |UP
2.1 One Qubit Carries One Bit and |LP inside the interferometer 11 = UP||UP , 12 = UP||LP , 21 = LP||UP , 22 = LP||LP .
49
(2.3)
If we perform the experiment in Fig. 2.2b to determine the path the particle takes, the particle will be found either in the upper path with probability $p_z^+ = \rho_{11}$ or in the lower path with probability $p_z^- = \rho_{22}$. The measure of information (Eq. 1.33) about the path the particle takes is

$$I_3 = (p_z^+ - p_z^-)^2 = (\rho_{11} - \rho_{22})^2. \qquad (2.4)$$

Here, the specific notation for the probabilities should remind us of the formal analogy with the spin case. Using the evolution transformations (2.2), the probability to find the particle in a specific beam behind the beam splitter in the experimental arrangement of Fig. 2.2a may easily be calculated as $p_x^+ = \frac{1}{2}\left(1 + 2|\rho_{21}|\sin(\phi + \delta)\right)$ for the beam $D_1$ and $p_x^- = \frac{1}{2}\left(1 - 2|\rho_{21}|\sin(\phi + \delta)\right)$ for the beam $D_2$. Here, we introduce $\rho_{21} = |\rho_{21}|e^{i\delta}$. The measure of information about the beam in which the particle will be found behind the beam splitter is then

$$I_1 = (p_x^+ - p_x^-)^2 = 4|\rho_{21}|^2\sin^2(\phi + \delta). \qquad (2.5)$$

Likewise, inserting $\phi + \pi/2$ for $\phi$ in the previous expression, we obtain the measure of information about the beam in which the particle will be found behind the beam splitter in the experimental arrangement of Fig. 2.2c. We find

$$I_2 = (p_y^+ - p_y^-)^2 = 4|\rho_{21}|^2\cos^2(\phi + \delta). \qquad (2.6)$$

We realize that the total information content of the system is [Brukner and Zeilinger, 1999(a), 1999(c)]

$$I_{total} = I_1(p_x^+, p_x^-) + I_2(p_y^+, p_y^-) + I_3(p_z^+, p_z^-) = 2\,\mathrm{Tr}\,\hat\rho^2 - 1. \qquad (2.7)$$

This results in just 1 bit of information for a pure state, when one single proposition with a definite truth value is assigned to the system. Likewise, 0 bits of information are reached for a completely mixed state, when no proposition with a definite truth value can be made about the system. We interpret expression (2.7) as implying the fundamental and absolute upper limit on our total information about an individual system in quantum mechanics.
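The sum rule (2.7) can be checked numerically. The following is a minimal sketch, assuming the beam-splitter amplitudes of Eq. (2.2) and an arbitrarily chosen example density matrix; the function names are illustrative and not part of the original text. The check does not depend on the sign or phase conventions entering $p_x^\pm$, because only $I_1 + I_2 = 4|\rho_{21}|^2$ is used.

```python
import cmath
import math


def detector_probabilities(rho, phi):
    """Return (p_D1, p_D2) behind the beam splitter for phase shift phi.

    rho is a 2x2 density matrix in the (|UP>, |LP>) basis.
    """
    r = 1 / math.sqrt(2)
    # Amplitudes <D|UP> and <D|LP>, read off from Eq. (2.2).
    to_d1 = [1j * cmath.exp(1j * phi) * r, r]
    to_d2 = [cmath.exp(1j * phi) * r, 1j * r]

    def prob(row):
        total = sum(row[a] * rho[a][b] * row[b].conjugate()
                    for a in range(2) for b in range(2))
        return total.real

    return prob(to_d1), prob(to_d2)


def information_triple(rho, phi):
    """Return (I1, I2, I3) for the three complementary arrangements."""
    p1, p2 = detector_probabilities(rho, phi)                # Fig. 2.2a
    q1, q2 = detector_probabilities(rho, phi + math.pi / 2)  # Fig. 2.2c
    i1 = (p1 - p2) ** 2
    i2 = (q1 - q2) ** 2
    i3 = (rho[0][0].real - rho[1][1].real) ** 2              # path measurement
    return i1, i2, i3


def purity(rho):
    """Tr(rho^2) for a 2x2 matrix."""
    return sum((rho[a][b] * rho[b][a]).real
               for a in range(2) for b in range(2))


# Example state (an assumption chosen for illustration only).
rho_example = [[0.7, 0.2 + 0.1j], [0.2 - 0.1j, 0.3]]
total = sum(information_triple(rho_example, phi=0.4))
print(total, 2 * purity(rho_example) - 1)
```

Changing `phi` redistributes information between $I_1$ and $I_2$ but leaves the printed total unchanged, which is the invariance discussed in the text.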
Note that the total information content of a quantum system is completely specified by the state of the system alone and is independent of the physical parameter (the phase shift $\phi$) that labels the various sets of mutually complementary observations. In the same spirit as choosing a coordinate system, one may choose any set of mutually complementary propositions to represent our knowledge of the system, and the total information about the system will be invariant under that choice. This is the reason we may use the phrase "the total information content" without explicitly specifying the particular reference set of mutually complementary propositions.

Our consideration will now be generalized to show that the total information content of a quantum system is invariant under general unitary transformations. We will use the formal equivalence with spin measurements. Since any Hermitian operator on a discrete two-dimensional Hilbert space can be expanded into the unit operator and the Pauli matrices $\hat\sigma_x$, $\hat\sigma_y$ and $\hat\sigma_z$ (see footnote 1), we can specify any density operator by the coefficients of these generators. The Pauli matrices fulfill the relations

$$\mathrm{Tr}\,\hat\sigma_j = 0 \quad\text{and}\quad \mathrm{Tr}\,\hat\sigma_i\hat\sigma_j = 2\delta_{ij}, \qquad (2.8)$$

where $i, j = x, y, z$. Also, observations associated with the three spin operators are mutually complementary. That is, if, e.g., the system is in the state $\hat\rho = \hat P_x^+ = |x+\rangle\langle x+|$, the measurement results for spin along the y-axis and for spin along the z-axis will be completely random. This results in

$$\mathrm{Tr}\,\hat P_x^{\pm}\hat P_y^{\pm} = \frac{1}{2}, \qquad \mathrm{Tr}\,\hat P_x^{\pm}\hat P_z^{\pm} = \frac{1}{2}, \qquad \mathrm{Tr}\,\hat P_y^{\pm}\hat P_z^{\pm} = \frac{1}{2}. \qquad (2.9)$$

The density operator then has the representation

$$\hat\rho = \frac{1}{2}\hat I + \frac{1}{2}\sum_{j=1}^{3} s_j\hat\sigma_j, \qquad (2.10)$$

where the factor 1/2 expresses the normalization condition $\mathrm{Tr}\,\hat\rho = 1$ and

$$s_j = \mathrm{Tr}\,\hat\rho\,\hat\sigma_j = p_j^+ - p_j^- \qquad (2.11)$$

is the mean value of the spin component $j$. The vector $\vec s = (s_x, s_y, s_z)$ is the three-dimensional Bloch vector, or the vector on the Poincaré sphere, which is real due to the hermiticity of $\hat\rho$.

Footnote 1: Any Hermitian operator on a discrete N-dimensional Hilbert space can be expanded into the unit operator and the generators of the SU(N) algebra. In the case N = 2 the generators of the SU(2) algebra are operators represented by the Pauli matrices.
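Note that the square of the component $s_j$ of the Bloch vector is exactly the measure of information $I_j(p_j^+, p_j^-)$ about the experimental outcome of the measurement of the spin along direction $j$, i.e.

$$(s_j)^2 = (\mathrm{Tr}\,\hat\rho\,\hat\sigma_j)^2 = (p_j^+ - p_j^-)^2 = I_j(p_j^+, p_j^-). \qquad (2.12)$$

Squaring the representation (2.10) and using relations (2.8), we find

$$\mathrm{Tr}\,\hat\rho^2 = \frac{1}{2} + \frac{1}{4}\sum_{i=1}^{3}\sum_{j=1}^{3} s_i s_j \underbrace{\mathrm{Tr}\,\hat\sigma_i\hat\sigma_j}_{=2\delta_{ij}} = \frac{1}{2} + \frac{1}{2}\sum_{j=1}^{3} s_j^2 = \frac{1}{2} + \frac{1}{2}\sum_{j=1}^{3} I_j(p_j^+, p_j^-) = \frac{1}{2} + \frac{1}{2}I_{total}, \qquad (2.13)$$

or equivalently

$$I_{total}(\hat\rho) = I_x(p_x^+, p_x^-) + I_y(p_y^+, p_y^-) + I_z(p_z^+, p_z^-) = 2\,\mathrm{Tr}\,\hat\rho^2 - 1. \qquad (2.14)$$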
If we apply a unitary transformation to the Pauli matrices, $\hat U\hat\sigma_x\hat U^{+}$, $\hat U\hat\sigma_y\hat U^{+}$ and $\hat U\hat\sigma_z\hat U^{+}$, the resulting operators fulfill relations (2.8) and (2.9) again. That is, the resulting operators again represent mutually complementary observables and are also generators of the SU(2) algebra. We may then represent the density matrix in terms of the new set of operators and the unit operator:

$$\hat\rho = \frac{1}{2}\hat I + \frac{1}{2}\sum_{j=1}^{3} s'_j\,\hat U\hat\sigma_j\hat U^{+}. \qquad (2.15)$$

Here, clearly,

$$(s'_j)^2 = (\mathrm{Tr}\,\hat\rho\,\hat U\hat\sigma_j\hat U^{+})^2. \qquad (2.16)$$

The ability to decompose the same density matrix into various sets of such operators implies that the total information of the system is completely independent of the particular set of mutually complementary observations (represented by a particular set of operators $\hat U\hat\sigma_x\hat U^{+}$, $\hat U\hat\sigma_y\hat U^{+}$ and $\hat U\hat\sigma_z\hat U^{+}$) considered, i.e.

$$I_{total} = 2\,\mathrm{Tr}\,\hat\rho^2 - 1 = \sum_{j=1}^{3} s_j^2 = \sum_{j=1}^{3} {s'_j}^2. \qquad (2.17)$$
Now, let $\hat\rho(t_0)$ be the initial state of the system. A time-evolved state at some future time is $\hat\rho(t) = \hat U\hat\rho(t_0)\hat U^{+}$, where $\hat U$ is a unitary transformation. Since

$$\mathrm{Tr}\,\hat\rho^2(t) = \mathrm{Tr}\,\hat U\hat\rho(t_0)\hat U^{+}\hat U\hat\rho(t_0)\hat U^{+} = \mathrm{Tr}\,\hat U\hat\rho^2(t_0)\hat U^{+} = \mathrm{Tr}\,\hat\rho^2(t_0), \qquad (2.18)$$

we conclude that the total information content of the system is conserved in time if there is no information exchange with the environment, that is, if the system is dynamically independent of the environment and not exposed to a measurement. A system is dynamically independent of the environment when the influence of the environment on the system does not change with a change of the state of the system (there is no back-reaction from the system on the environment). A concrete example of dynamically dependent particles is given by the interacting single particles in an entangled pair of particles (see Sec. 3.5).
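The invariance statements (2.17) and (2.18) can be illustrated with a short numerical sketch. It assumes the standard matrix representation of the Pauli matrices and a two-parameter family of unitaries chosen only for illustration; none of the helper names come from the original text.

```python
import cmath
import math

PAULI = {
    "x": [[0, 1], [1, 0]],
    "y": [[0, -1j], [1j, 0]],
    "z": [[1, 0], [0, -1]],
}


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def bloch_vector(rho, u=None):
    """Components s_j = Tr(rho U sigma_j U^+); u=None uses the plain Pauli set."""
    components = []
    for name in "xyz":
        sigma = PAULI[name]
        if u is not None:
            sigma = matmul(matmul(u, sigma), dagger(u))
        product = matmul(rho, sigma)
        components.append(complex(product[0][0] + product[1][1]).real)
    return components


def total_information(rho, u=None):
    """Sum of squared Bloch components, Eq. (2.17)."""
    return sum(s * s for s in bloch_vector(rho, u))


def unitary(alpha, beta):
    """An illustrative 2x2 unitary (assumed parametrization)."""
    c, s = math.cos(alpha), math.sin(alpha)
    return [[c, -s * cmath.exp(1j * beta)], [s * cmath.exp(-1j * beta), c]]


def evolve(rho, u):
    """rho(t) = U rho(t0) U^+, as in the text above Eq. (2.18)."""
    return matmul(matmul(u, rho), dagger(u))


rho0 = [[0.7, 0.2 + 0.1j], [0.2 - 0.1j, 0.3]]   # example state (assumption)
u = unitary(0.3, 1.1)
print(total_information(rho0))             # reference set sigma_x, sigma_y, sigma_z
print(total_information(rho0, u))          # rotated set U sigma_j U^+
print(total_information(evolve(rho0, u)))  # time-evolved state
```

All three printed values agree, reflecting that the total information depends only on $\mathrm{Tr}\,\hat\rho^2$.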
2.2 Two Qubits Carry Two Bits - Entanglement
The total information content of a composite system consisting of two qubits is 2 bits of information. That information is invariant under the choice of the particular set of mutually complementary observations and conserved in time if there is no information exchange with an environment. Entanglement results from the fact that information could also be distributed in joint properties of a multiparticle system. In particular, maximal entanglement arises when the total information of a composite system is exhausted in specifying joint properties.
2.2.1 Pairs of Complementary Propositions
In contrast to the single-qubit system, where an individual proposition defines the information content of one bit, for a two-qubit system a pair of propositions will define the information content of two bits. Again, we will consider sets of mutually complementary pairs of propositions, where precise knowledge of the truth values of a specific pair of propositions excludes any knowledge of the truth values of the other complementary pairs of propositions. As opposed to the single-particle case, where 3 individual propositions are complementary to each other, in the two-particle case we have 5 pairs of propositions, where each pair is complementary to each other pair (see footnote 2).

Footnote 2: The exact number of mutually complementary observables with respect to the dimension of the Hilbert space of the system will be discussed in Sec. 2.3. Here we note that for a composite system consisting of k qubits, there are $2^k + 1$ mutually complementary observables.

In order to analyze a simple composite system in view of the ideas proposed above, let us consider setups where a source emits two particles which feed two Mach-Zehnder interferometers, as shown in Fig. 2.3. Phase shifters in both interferometers permit continuous variation of the phases $\phi_1$ and $\phi_2$. We give one possible choice of a complete set of complementary propositions for the two particles:

(A) "Particle 1 takes the upper path inside the interferometer" and "Particle 2 takes the upper path inside the interferometer";

(B) "Particle 1 takes the upper outgoing path in the presence of the phase shift $\phi_1$" and "Particle 2 takes the upper outgoing path in the presence of the phase shift $\phi_2$";

(C) "Particle 1 takes the upper outgoing path in the presence of the phase shift $\phi_1 + \pi/2$" and "Particle 2 takes the upper outgoing path in the presence of the phase shift $\phi_2 + \pi/2$";

(D) "The path of particle 1 inside the interferometer and the outgoing path of particle 2 in the presence of the phase shift $\phi_2$ are the same" and "Particle 1 and particle 2 take the same outgoing path in the presence of the phase shifts $\phi_1$ and $\phi_2 + \pi/2$";

(E) "The path of particle 1 inside the interferometer and the outgoing path of particle 2 in the presence of the phase shift $\phi_2 + \pi/2$ are the same" and "Particle 1 and particle 2 take the same outgoing path in the presence of the phase shifts $\phi_1 + \pi/2$ and $\phi_2$".

Here, for example, in propositions (D) and (E) the phrase "The path of particle 1 ... and the outgoing path of particle 2 ... are the same" means precisely that particle 1 and particle 2 either both take the upper path (particle 1 the upper path inside the interferometer and particle 2 the upper outgoing path) or both take the lower path (particle 1 the lower path inside the interferometer and particle 2 the lower outgoing path). Analogously to the single-particle case, different sets of mutually complementary pairs of propositions are labeled by the parameters $\phi_1$ and $\phi_2$. In a set of mutually complementary spin measurements for two spin-1/2 particles these parameters would correspond to the angles of rotation of the magnets in the Stern-Gerlach apparatuses for particle 1 and for particle 2. A formally equivalent
set of mutually complementary pairs of propositions includes spin measurements on particle 1 and particle 2 along directions assumed to lie in the x-y plane, oriented at angles $\theta_1$ and $\theta_2$ with respect to the x-axis [Brukner and Zeilinger, 1999(d)]:

(A) "The spin of particle 1 is up along z" and "The spin of particle 2 is up along z";

(B) "The spin of particle 1 is up along $\theta_1$ in the x-y plane" and "The spin of particle 2 is up along $\theta_2$ in the x-y plane";

(C) "The spin of particle 1 is up along $\theta_1 + 90°$ in the x-y plane" and "The spin of particle 2 is up along $\theta_2 + 90°$ in the x-y plane";

(D) "The spin of particle 1 along z and the spin of particle 2 along $\theta_2$ in the x-y plane are the same" and "The spin of particle 1 along $\theta_1$ in the x-y plane and the spin of particle 2 along $\theta_2 + 90°$ in the x-y plane are the same"; and

(E) "The spin of particle 1 along z and the spin of particle 2 along $\theta_2 + 90°$ in the x-y plane are the same" and "The spin of particle 1 along $\theta_1 + 90°$ in the x-y plane and the spin of particle 2 along $\theta_2$ in the x-y plane are the same".

We now consider the states defined by each pair of propositions. The four product states

$$|z+\rangle_1|z+\rangle_2, \quad |z+\rangle_1|z-\rangle_2, \quad |z-\rangle_1|z+\rangle_2 \quad\text{and}\quad |z-\rangle_1|z-\rangle_2 \qquad (2.19)$$

represent the two-bit combinations true-true, true-false, false-true and false-false of the truth values of the pair of propositions given in (A) above. Similarly, the product states

$$|x+\rangle_1|x+\rangle_2, \quad |x+\rangle_1|x-\rangle_2, \quad |x-\rangle_1|x+\rangle_2 \quad\text{and}\quad |x-\rangle_1|x-\rangle_2 \qquad (2.20)$$

and

$$|y+\rangle_1|y+\rangle_2, \quad |y+\rangle_1|y-\rangle_2, \quad |y-\rangle_1|y+\rangle_2 \quad\text{and}\quad |y-\rangle_1|y-\rangle_2 \qquad (2.21)$$

represent the two-bit combinations true-true, true-false, false-true and false-false of the propositions given in (B) and (C), respectively. Here $|x+\rangle$, $|x-\rangle$ and $|y+\rangle$, $|y-\rangle$ are eigenbases of spin rotated by $\theta_1$ and $\theta_2$ separately. We emphasize
that the three sets of product states in Eqs. (2.19), (2.20) and (2.21) might be seen as eigenbases of three mutually complementary pairs of observables. Then, if the composite system is in an eigenstate of one of the pairs of observables, the probabilities to find the system in the eigenstates of the complementary pairs of observables will all be equal (having a value of 1/4).

If, instead of describing the composite system with propositions about properties of individual particles, we consider propositions describing joint properties of the composite system, we obtain four Bell states (notice that they are not the usual Bell states):

$$|\psi_1\rangle = \frac{1}{\sqrt{2}}\left(|z+\rangle_1|x+\rangle_2 + i|z-\rangle_1|x-\rangle_2\right) = \frac{1}{\sqrt{2}}\left(|x+\rangle_1|y+\rangle_2 + i|x-\rangle_1|y-\rangle_2\right),$$
$$|\psi_2\rangle = \frac{1}{\sqrt{2}}\left(i|z+\rangle_1|x+\rangle_2 + |z-\rangle_1|x-\rangle_2\right) = \frac{1}{\sqrt{2}}\left(|x+\rangle_1|y-\rangle_2 - i|x-\rangle_1|y+\rangle_2\right),$$
$$|\psi_3\rangle = \frac{1}{\sqrt{2}}\left(i|z+\rangle_1|x-\rangle_2 + |z-\rangle_1|x+\rangle_2\right) = \frac{1}{\sqrt{2}}\left(|x+\rangle_1|y+\rangle_2 - i|x-\rangle_1|y-\rangle_2\right),$$
$$|\psi_4\rangle = \frac{1}{\sqrt{2}}\left(|z+\rangle_1|x-\rangle_2 + i|z-\rangle_1|x+\rangle_2\right) = \frac{1}{\sqrt{2}}\left(|x+\rangle_1|y-\rangle_2 + i|x-\rangle_1|y+\rangle_2\right).$$

These four Bell states are now the representation of the four possible two-bit combinations true-true, true-false, false-true and false-false of the truth values of the pair of propositions given in (D). Similarly, the four Bell states
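$$|\phi_1\rangle = \frac{1}{\sqrt{2}}\left(|y+\rangle_1|x+\rangle_2 - i|y-\rangle_1|x-\rangle_2\right),$$
$$|\phi_2\rangle = \frac{1}{\sqrt{2}}\left(|y+\rangle_1|x-\rangle_2 + i|y-\rangle_1|x+\rangle_2\right),$$
$$|\phi_3\rangle = \frac{1}{\sqrt{2}}\left(|y+\rangle_1|x+\rangle_2 + i|y-\rangle_1|x-\rangle_2\right),$$
$$|\phi_4\rangle = \frac{1}{\sqrt{2}}\left(|y+\rangle_1|x-\rangle_2 - i|y-\rangle_1|x+\rangle_2\right)$$

(equivalently, up to phase factors $e^{\pm i\pi/4}$, equal-weight superpositions of $|z+\rangle_1|y+\rangle_2$ and $|z-\rangle_1|y-\rangle_2$ for $|\phi_1\rangle$ and $|\phi_2\rangle$, and of $|z+\rangle_1|y-\rangle_2$ and $|z-\rangle_1|y+\rangle_2$ for $|\phi_3\rangle$ and $|\phi_4\rangle$)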
represent the four possible two-bit combinations true-true, true-false, false-true and false-false of the truth values of the propositions given in (E). Again, the two sets of Bell states above might be seen as eigenbases of two mutually complementary pairs of observables, with the property

$$|\langle\psi_i|\phi_j\rangle|^2 = \frac{1}{4} \qquad \forall\, i, j = 1, 2, 3, 4. \qquad (2.22)$$

The two pairs of observables are then also complementary to the pairs of observables defined by the eigenbases (2.19), (2.20) and (2.21).
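The complementarity between a basis of joint states and the three product bases can be checked directly. The sketch below is an illustration under stated assumptions: it uses the standard (unrotated) spin eigenvectors, $|x\pm\rangle = (|z+\rangle \pm |z-\rangle)/\sqrt{2}$ and $|y\pm\rangle = (|z+\rangle \pm i|z-\rangle)/\sqrt{2}$, and builds the four joint states from their $|z\pm\rangle_1|x\pm\rangle_2$ representation given above for the pair of propositions (D). All names are illustrative.

```python
import math

R = 1 / math.sqrt(2)
# Standard single-spin eigenvectors in the z basis (assumed phase conventions).
Z = {"+": [1, 0], "-": [0, 1]}
X = {"+": [R, R], "-": [R, -R]}
Y = {"+": [R, 1j * R], "-": [R, -1j * R]}


def kron(a, b):
    """Tensor product of two single-particle state vectors."""
    return [x * y for x in a for y in b]


def combo(c1, v1, c2, v2):
    """Linear combination c1*v1 + c2*v2 of two state vectors."""
    return [c1 * p + c2 * q for p, q in zip(v1, v2)]


def overlap_sq(a, b):
    """|<a|b>|^2 for two state vectors."""
    return abs(sum(complex(x).conjugate() * y for x, y in zip(a, b))) ** 2


def product_basis(basis):
    """The four product states built from one single-particle basis."""
    return [kron(basis[s], basis[t]) for s in "+-" for t in "+-"]


# Joint states for the pair of propositions (D), z-x representation.
JOINT_D = [
    combo(R, kron(Z["+"], X["+"]), 1j * R, kron(Z["-"], X["-"])),
    combo(1j * R, kron(Z["+"], X["+"]), R, kron(Z["-"], X["-"])),
    combo(1j * R, kron(Z["+"], X["-"]), R, kron(Z["-"], X["+"])),
    combo(R, kron(Z["+"], X["-"]), 1j * R, kron(Z["-"], X["+"])),
]

for label, basis in (("z", Z), ("x", X), ("y", Y)):
    overlaps = [round(overlap_sq(prod, JOINT_D[0]), 12)
                for prod in product_basis(basis)]
    print(label, overlaps)
```

Each printed list contains four equal entries of 1/4: a system prepared in one of the joint states yields completely random results for the pairs of propositions (A), (B) and (C).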
2.2.2 Invariant Information in Two Qubits
We now calculate the four probabilities $p_1^j$, $p_2^j$, $p_3^j$ and $p_4^j$ ($j = A, B, C, D, E$) for the system in a state $\hat\rho$ to give the four possible combinations (true-true, true-false, false-true and false-false) of the truth values for the pair of propositions $j$. For example, for the pair (A) we find:

$p_1^A = \langle z+|_1\langle z+|_2\,\hat\rho\,|z+\rangle_1|z+\rangle_2$ is the probability to find the spin of particle 1 up along z and the spin of particle 2 up along z,

$p_2^A = \langle z+|_1\langle z-|_2\,\hat\rho\,|z+\rangle_1|z-\rangle_2$ is the probability to find the spin of particle 1 up along z and the spin of particle 2 down along z,

$p_3^A = \langle z-|_1\langle z+|_2\,\hat\rho\,|z-\rangle_1|z+\rangle_2$ is the probability to find the spin of particle 1 down along z and the spin of particle 2 up along z, and

$p_4^A = \langle z-|_1\langle z-|_2\,\hat\rho\,|z-\rangle_1|z-\rangle_2$ is the probability to find the spin of particle 1 down along z and the spin of particle 2 down along z.

Likewise,

$p_1^D = \langle\psi_1|\hat\rho|\psi_1\rangle$ is the probability that the spin of particle 1 along z and the spin of particle 2 along $\theta_2$ are the same and the spin of particle 1 along $\theta_1$ and the spin of particle 2 along $\theta_2 + 90°$ are the same,

$p_2^D = \langle\psi_2|\hat\rho|\psi_2\rangle$ is the probability that the spin of particle 1 along z and the spin of particle 2 along $\theta_2$ are the same and the spin of particle 1 along $\theta_1$ and the spin of particle 2 along $\theta_2 + 90°$ are different,

$p_3^D = \langle\psi_3|\hat\rho|\psi_3\rangle$ is the probability that the spin of particle 1 along z and the spin of particle 2 along $\theta_2$ are different and the spin of particle 1 along $\theta_1$ and the spin of particle 2 along $\theta_2 + 90°$ are the same, and

$p_4^D = \langle\psi_4|\hat\rho|\psi_4\rangle$ is the probability that the spin of particle 1 along z and the spin of particle 2 along $\theta_2$ are different and the spin of particle 1 along $\theta_1$ and the spin of particle 2 along $\theta_2 + 90°$ are different.

To obtain these probabilities we need to perform Bell-state measurements, as indicated in Fig. 2.3, and project the state $\hat\rho$ onto the Bell states listed above (see footnote 3).

Footnote 3: By transforming the four Bell states into four product states [Barenco et al., 1995(b)], [Bruss et al., 1996], the Bell measurement may be reduced to two single-particle measurements. The method includes a Hadamard transformation for a single particle and a quantum controlled-NOT gate.
We sum the individual measures of information (1.34) over a complete set of mutually complementary observations and find [Brukner and Zeilinger, 1999(a)]

$$I_{total} = I_A(\vec p^A) + I_B(\vec p^B) + I_C(\vec p^C) + I_D(\vec p^D) + I_E(\vec p^E) = \frac{2}{3}\left(4\,\mathrm{Tr}\,\hat\rho^2 - 1\right) \qquad (2.23)$$

for the total information carried by the composite system. This again is invariant under unitary transformations. Independence of the physical parameters $\theta_1$ and $\theta_2$ implies that the total information of the composite system is invariant under the choice of the particular set of mutually complementary pairs of propositions. Also, the total information of the composite system is conserved in time if there is no information exchange between the composite system and an environment.

A composite two-qubit system in a pure state carries two bits of information. This information, contained in two propositions, can be distributed over the two particles in various ways. It may be carried by the two particles individually, e.g., as the two-bit combination false-true of the truth values of the propositions given in (A). This would then be represented by the product state $|z-\rangle_1|z+\rangle_2$. The two bits of information are thus encoded in the two particles separately, one bit in each particle, just as in classical physics. In this case there is no additional information represented jointly by the two systems.

Alternatively, the two bits of information might all be carried by the two particles in a joint way, in the extreme case with no individual particle carrying any information on its own. For example, this could be the two-bit combination true-false of the truth values of the propositions given in (D). Again, this is represented by the entangled state $|\psi_2\rangle$. This Bell state does not contain any information about the individuals; instead, all information is contained in joint properties. In fact, now there cannot be any information carried by the individuals, because the two bits of information are exhausted by defining that maximally entangled state, and no further possibility exists to also encode information in the individuals. See also [Zeilinger, 1997, 1999] and [Brukner and Zeilinger, 1999(d)].

We now derive our result (2.23) stated above.
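The direct product of the operator bases introduced in Sec. 2.1 may serve as an operator basis in the composite system. The density operator then reads

$$\hat\rho = \frac{1}{4}\,\hat I\otimes\hat I + \frac{1}{4}\sum_{i=1}^{3} s_i^1\left(\hat\sigma_i^1\otimes\hat I\right) + \frac{1}{4}\sum_{j=1}^{3} s_j^2\left(\hat I\otimes\hat\sigma_j^2\right) + \frac{1}{4}\sum_{i,j=1}^{3} T_{ij}\left(\hat\sigma_i^1\otimes\hat\sigma_j^2\right). \qquad (2.24)$$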
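Again the factor 1/4 is due to the normalization condition $\mathrm{Tr}\,\hat\rho = 1$. The two Bloch vectors $\vec s^{\,1} = (s_x^1, s_y^1, s_z^1)$ and $\vec s^{\,2} = (s_x^2, s_y^2, s_z^2)$, with

$$s_j^1 = \mathrm{Tr}\,\hat\rho\left(\hat\sigma_j^1\otimes\hat I\right) \quad\text{and}\quad s_j^2 = \mathrm{Tr}\,\hat\rho\left(\hat I\otimes\hat\sigma_j^2\right), \qquad (2.25)$$

determine the properties of the individual particles, while the second-rank tensor

$$T_{ij} = \mathrm{Tr}\,\hat\rho\left(\hat\sigma_i^1\otimes\hat\sigma_j^2\right) \qquad (2.26)$$

accounts for correlations between the two particles. Performing the partial trace over particle 2, one obtains the reduced density operator for particle 1, i.e.

$$\hat\rho_1 = \mathrm{Tr}_2\,\hat\rho = \frac{1}{2}\hat I + \frac{1}{2}\sum_{j=1}^{3} s_j^1\hat\sigma_j^1. \qquad (2.27)$$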
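The reduced density operator $\hat\rho_2$ for particle 2 is calculated in an analogous way. We now calculate the trace of the square of the density matrix:

$$\mathrm{Tr}\,\hat\rho^2 = \frac{1}{16}\,\mathrm{Tr}\left(\hat I\otimes\hat I\right) + \frac{1}{8}\sum_{i=1}^{3} s_i^1\underbrace{\mathrm{Tr}\left(\hat\sigma_i^1\otimes\hat I\right)}_{=0} + \frac{1}{8}\sum_{j=1}^{3} s_j^2\underbrace{\mathrm{Tr}\left(\hat I\otimes\hat\sigma_j^2\right)}_{=0}$$
$$\quad + \frac{1}{16}\sum_{i,j=1}^{3} s_i^1 s_j^1\underbrace{\mathrm{Tr}\left(\hat\sigma_i^1\hat\sigma_j^1\otimes\hat I\right)}_{=4\delta_{ij}} + \frac{1}{16}\sum_{i,j=1}^{3} s_i^2 s_j^2\underbrace{\mathrm{Tr}\left(\hat I\otimes\hat\sigma_i^2\hat\sigma_j^2\right)}_{=4\delta_{ij}} \qquad (2.28)$$
$$\quad + \frac{1}{8}\sum_{i,j=1}^{3}\left(T_{ij} + s_i^1 s_j^2\right)\underbrace{\mathrm{Tr}\left(\hat\sigma_i^1\otimes\hat\sigma_j^2\right)}_{=0} + \frac{1}{16}\sum_{i,j,k=1}^{3} T_{ij}\,s_k^1\underbrace{\mathrm{Tr}\left((\hat\sigma_i^1\hat\sigma_k^1 + \hat\sigma_k^1\hat\sigma_i^1)\otimes\hat\sigma_j^2\right)}_{=0}$$
$$\quad + \frac{1}{16}\sum_{i,j,k=1}^{3} T_{ij}\,s_k^2\underbrace{\mathrm{Tr}\left(\hat\sigma_i^1\otimes(\hat\sigma_j^2\hat\sigma_k^2 + \hat\sigma_k^2\hat\sigma_j^2)\right)}_{=0} + \frac{1}{16}\sum_{i,j,k,l=1}^{3} T_{ij}T_{kl}\underbrace{\mathrm{Tr}\left(\hat\sigma_i^1\hat\sigma_k^1\otimes\hat\sigma_j^2\hat\sigma_l^2\right)}_{=4\delta_{ik}\delta_{jl}}$$
$$= \frac{1}{4} + \frac{1}{4}\sum_{i=1}^{3}(s_i^1)^2 + \frac{1}{4}\sum_{i=1}^{3}(s_i^2)^2 + \frac{1}{4}\sum_{i,j=1}^{3}(T_{ij})^2. \qquad (2.29)$$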
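From this final sum we single out specific terms and evaluate them further. It is easy to see that

$$(s_z^1)^2 + (s_z^2)^2 + (T_{zz})^2 = (\mathrm{Tr}\,\hat\rho\,\hat\sigma_z^1)^2 + (\mathrm{Tr}\,\hat\rho\,\hat\sigma_z^2)^2 + (\mathrm{Tr}\,\hat\rho\,\hat\sigma_z^1\hat\sigma_z^2)^2 = 4\sum_{i=1}^{4}(p_i^A)^2 - 1 = \frac{3}{2}I_A(\vec p^A). \qquad (2.30)$$

Likewise

$$(s_x^1)^2 + (s_x^2)^2 + (T_{xx})^2 = \frac{3}{2}I_B(\vec p^B) \qquad (2.31)$$

and

$$(s_y^1)^2 + (s_y^2)^2 + (T_{yy})^2 = \frac{3}{2}I_C(\vec p^C). \qquad (2.32)$$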
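We evaluate

$$(T_{zx})^2 + (T_{xy})^2 + (T_{yz})^2 = (\mathrm{Tr}\,\hat\rho\,\hat\sigma_z^1\hat\sigma_x^2)^2 + (\mathrm{Tr}\,\hat\rho\,\hat\sigma_x^1\hat\sigma_y^2)^2 + (\mathrm{Tr}\,\hat\rho\,\hat\sigma_y^1\hat\sigma_z^2)^2$$
$$= \left(p(\sigma_z^1\sigma_x^2 = 1) - p(\sigma_z^1\sigma_x^2 = -1)\right)^2 + \left(p(\sigma_x^1\sigma_y^2 = 1) - p(\sigma_x^1\sigma_y^2 = -1)\right)^2 + \left(p(\sigma_y^1\sigma_z^2 = 1) - p(\sigma_y^1\sigma_z^2 = -1)\right)^2, \qquad (2.33)$$

where, e.g., $p(\sigma_z^1\sigma_x^2 = 1)$ is the probability to find the spin of particle 1 along z and the spin of particle 2 along $\theta_2$ to be the same. In Eq. (2.33) there appear three typical probabilities $p(\sigma_z^1\sigma_x^2 = 1)$, $p(\sigma_x^1\sigma_y^2 = 1)$ and $p(\sigma_y^1\sigma_z^2 = 1)$. We evaluate

$$p(\sigma_z^1\sigma_x^2 = 1) = p(\sigma_z^1\sigma_x^2 = 1, \sigma_x^1\sigma_y^2 = 1) + p(\sigma_z^1\sigma_x^2 = 1, \sigma_x^1\sigma_y^2 = -1) = p_1^D + p_2^D,$$

where $p(\sigma_z^1\sigma_x^2 = 1, \sigma_x^1\sigma_y^2 = 1) = p_1^D$, and likewise

$$p(\sigma_x^1\sigma_y^2 = 1) = p(\sigma_z^1\sigma_x^2 = 1, \sigma_x^1\sigma_y^2 = 1) + p(\sigma_z^1\sigma_x^2 = -1, \sigma_x^1\sigma_y^2 = 1) = p_1^D + p_3^D.$$

Using the fact that $\hat\sigma_z^1\hat\sigma_x^2$ and $\hat\sigma_x^1\hat\sigma_y^2$ commute and that $(\hat\sigma_z^1\hat\sigma_x^2)(\hat\sigma_x^1\hat\sigma_y^2) = -\hat\sigma_y^1\hat\sigma_z^2$, we may write

$$p(\sigma_y^1\sigma_z^2 = 1) = p(\sigma_z^1\sigma_x^2 = 1, \sigma_x^1\sigma_y^2 = -1) + p(\sigma_z^1\sigma_x^2 = -1, \sigma_x^1\sigma_y^2 = 1) = p_2^D + p_3^D.$$

Inserting the expressions for the probabilities into Eq. (2.33) we obtain

$$(T_{zx})^2 + (T_{xy})^2 + (T_{yz})^2 = \frac{3}{2}I_D(\vec p^D). \qquad (2.34)$$

Repeating a similar consideration for the remaining terms in Eq. (2.29), one may easily obtain

$$(T_{zy})^2 + (T_{yx})^2 + (T_{xz})^2 = \frac{3}{2}I_E(\vec p^E). \qquad (2.35)$$

Finally, we sum Eqs. (2.30), (2.31), (2.32), (2.34) and (2.35) and find

$$I_A(\vec p^A) + I_B(\vec p^B) + I_C(\vec p^C) + I_D(\vec p^D) + I_E(\vec p^E) = \frac{2}{3}\left(4\,\mathrm{Tr}\,\hat\rho^2 - 1\right).$$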
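The grouping of Bloch-vector and correlation-tensor components used in the derivation can be reproduced numerically. The sketch below is an illustration under stated assumptions: it takes the standard Pauli matrices (that is, unrotated axes, $\theta_1 = \theta_2 = 0$) and assigns to each pair of propositions $j$ the quantity $I_j = \frac{2}{3}\times$(sum of the three squared components of its group), following Eqs. (2.30) to (2.35). The helper names and the example states are illustrative only.

```python
I2 = [[1, 0], [0, 1]]
SIG = {
    "x": [[0, 1], [1, 0]],
    "y": [[0, -1j], [1j, 0]],
    "z": [[1, 0], [0, -1]],
}


def kron(a, b):
    """Tensor product of two 2x2 matrices as a 4x4 matrix."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]


def expect(rho, op):
    """Tr(rho op) for 4x4 matrices (real for Hermitian op)."""
    return sum(rho[i][j] * op[j][i] for i in range(4) for j in range(4)).real


def projector(vec):
    """|vec><vec| for a normalized state vector."""
    return [[a * complex(b).conjugate() for b in vec] for a in vec]


def purity(rho):
    return sum((rho[i][j] * rho[j][i]).real
               for i in range(4) for j in range(4))


def information_by_pair(rho):
    """Information assigned to the pairs of propositions A..E."""
    s1 = {a: expect(rho, kron(SIG[a], I2)) for a in "xyz"}
    s2 = {a: expect(rho, kron(I2, SIG[a])) for a in "xyz"}
    t = {a + b: expect(rho, kron(SIG[a], SIG[b]))
         for a in "xyz" for b in "xyz"}
    norm = 2 / 3
    return {
        "A": norm * (s1["z"] ** 2 + s2["z"] ** 2 + t["zz"] ** 2),
        "B": norm * (s1["x"] ** 2 + s2["x"] ** 2 + t["xx"] ** 2),
        "C": norm * (s1["y"] ** 2 + s2["y"] ** 2 + t["yy"] ** 2),
        "D": norm * (t["zx"] ** 2 + t["xy"] ** 2 + t["yz"] ** 2),
        "E": norm * (t["zy"] ** 2 + t["yx"] ** 2 + t["xz"] ** 2),
    }


# Product state |z->_1 |z+>_2 and the joint state (|z+>|x+> + i|z->|x->)/sqrt(2),
# both written in the z-z product basis.
product_state = projector([0, 0, 1, 0])
joint_state = projector([0.5, 0.5, 0.5j, -0.5j])

print(information_by_pair(product_state))  # all information in pair (A)
print(information_by_pair(joint_state))    # all information in pair (D)
```

For the product state the two bits sit entirely in pair (A), that is, in the individual particles; for the joint state they sit entirely in pair (D), with both single-particle Bloch vectors equal to zero.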
2.3
Consider a system described by an $n \times n$ density matrix. To specify the state of the system completely one needs $n^2 - 1$ independent real numbers. Any individual, complete measurement (we consider here only complete measurements, i.e., those where the operators associated with the measurements are without degeneracy) with $n$ possible outcomes defines $n - 1$ independent probability values (the sum of the probabilities for all possible outcomes in an individual experiment is one). Therefore, just on the basis of counting the number of independent variables, we expect that the number of different measurements we need in order to determine the density matrix completely is $\frac{n^2 - 1}{n - 1} = n + 1$. For example, when $n$ equals 2 or 4, as for a qubit or a composite system consisting of two qubits, we need to make 3 and 5 different measurements respectively, as we have already seen.

Ivanovic [1981] and Wootters and Fields [1989] demonstrated the existence of exactly $n + 1$ mutually complementary observables by an explicit construction in the cases of $n$ prime and $n = 2^k$. Thus, every density matrix is uniquely defined by the set of $n + 1$ probability distributions over the outcomes of mutually complementary observables. The knowledge of these probability distributions is sufficient to calculate the future probability for any specific measurement result. In that sense, the probabilities for all possible outcomes of all mutually complementary observables are a minimal set of physical quantities which describe the system completely. To our knowledge, the question as to whether it is possible to find $n + 1$ mutually complementary observables for an arbitrary dimension $n$ of the Hilbert space is open. Here we will restrict our consideration to the cases that are known.

Consider a complete set of mutually complementary observables (without degeneracy) $\hat A^1, ..., \hat A^{n+1}$. These have the property that complete knowledge of the eigenvalue of any one of the observables excludes any knowledge about the eigenvalues of all other observables.
That is, if the system is in an eigenstate (see footnote 4) of the observable $\hat A^i = \sum_j a_j^i\hat P_j^i$ and if we measure the observable $\hat A^k$, $k \neq i$, the measurement results will be completely random (all measurement results will be equally probable). This implies

$$\mathrm{Tr}\left(\hat P_j^i\hat P_r^k\right) = \frac{1}{n} + \delta_{ik}\left(\delta_{jr} - \frac{1}{n}\right). \qquad (2.36)$$

Here, $\hat P_j^i$ denotes the projector onto the one-dimensional eigenspace associated with the $j$-th outcome $a_j^i$ of the observable $\hat A^i$.

Footnote 4: In [Ivanovic, 1981] and in [Wootters and Fields, 1989] eigenbases of mutually complementary observables are called mutually unbiased.

Note that other complete sets of mutually complementary observables may be constructed from the original one, $\hat A^1, ..., \hat A^{n+1}$, by applying a unitary transformation:

$$\hat U\hat A^1\hat U^{+}, ..., \hat U\hat A^{n+1}\hat U^{+}. \qquad (2.37)$$

This new set of observables then also fulfills the requirement (2.36). The density operator may be decomposed into the identity operator and the projectors $\hat P_j^i$ [Ivanovic, 1981]:

$$\hat\rho = \sum_{i=1}^{n+1}\sum_{j=1}^{n} p_j^i\hat P_j^i - \hat I, \qquad (2.38)$$

where $p_j^i$ denotes the probability to observe the $j$-th outcome $a_j^i$ of the observable $\hat A^i$. Using Eq. (2.38) we obtain
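$$\mathrm{Tr}\,\hat\rho^2 = \sum_{i,k=1}^{n+1}\sum_{j,l=1}^{n} p_j^i\,p_l^k\underbrace{\mathrm{Tr}\left(\hat P_j^i\hat P_l^k\right)}_{=\frac{1}{n} + \delta_{ik}\left(\delta_{jl} - \frac{1}{n}\right)} - 2\sum_{i=1}^{n+1}\sum_{j=1}^{n} p_j^i\underbrace{\mathrm{Tr}\,\hat P_j^i}_{=1} + \underbrace{\mathrm{Tr}\,\hat I}_{=n} = \sum_{i=1}^{n+1}\sum_{j=1}^{n}(p_j^i)^2 - 1. \qquad (2.39)$$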
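For the simplest case $n = 2$, relations (2.36), (2.38) and (2.39) can be verified with a few lines of code. The sketch below assumes that the three mutually complementary observables are the spin components along z, x and y with their standard eigenvectors; the function names and the example state are illustrative.

```python
R = 2 ** -0.5
# Eigenbases of sigma_z, sigma_x and sigma_y (assumed standard conventions).
BASES = [
    [[1, 0], [0, 1]],
    [[R, R], [R, -R]],
    [[R, 1j * R], [R, -1j * R]],
]


def projector(v):
    """|v><v| for a normalized two-component vector."""
    return [[a * complex(b).conjugate() for b in v] for a in v]


def trace_product(a, b):
    """Tr(a b) for 2x2 matrices."""
    return sum(a[i][j] * b[j][i] for i in range(2) for j in range(2))


def outcome_probabilities(rho):
    """p_j^i = Tr(rho P_j^i) for all n + 1 = 3 observables."""
    return [[complex(trace_product(rho, projector(v))).real for v in basis]
            for basis in BASES]


def reconstruct(probs):
    """Rebuild rho from the probabilities via Eq. (2.38)."""
    rho = [[-1, 0], [0, -1]]  # start from -I
    for basis, row in zip(BASES, probs):
        for v, p in zip(basis, row):
            pr = projector(v)
            rho = [[rho[i][j] + p * pr[i][j] for j in range(2)]
                   for i in range(2)]
    return rho


rho_example = [[0.7, 0.2 + 0.1j], [0.2 - 0.1j, 0.3]]  # illustrative state
probs = outcome_probabilities(rho_example)
print(probs)
print(reconstruct(probs))
print(sum(p * p for row in probs for p in row) - 1)  # equals Tr(rho^2), Eq. (2.39)
```

The three probability distributions contain $n^2 - 1 = 3$ independent numbers, exactly as many as are needed to rebuild the density matrix.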
For $n = 2^k$, when $k$ bits of information can be encoded in individual measurements, this results in

$$\mathrm{Tr}\,\hat\rho^2 = \sum_{i=1}^{2^k+1}\sum_{j=1}^{2^k}(p_j^i)^2 - 1 = \frac{2^k - 1}{2^k k}\sum_{i=1}^{2^k+1} I_i(\vec p^{\,i}) + \frac{1}{2^k}, \qquad (2.40)$$

where $I_i(\vec p^{\,i})$ is the measure of information for the measurement of $\hat A^i$ defined by Eq. (1.32). Then the total information content of the quantum system

$$I_{total} = \sum_{i=1}^{2^k+1} I_i(\vec p^{\,i}) = \frac{k}{2^k - 1}\left(2^k\,\mathrm{Tr}\,\hat\rho^2 - 1\right) \qquad (2.41)$$

results in just $k$ bits of information for the system in a pure state, and in 0 bits of information for the system in the completely mixed state. Note that the density operator can be decomposed into any complete set of mutually complementary observables. This implies that the total information
content of the system is independent of the particular set of mutually complementary observables considered. Instead, it is a characterization of the state of the system alone, and not of the specific reference set of complementary observables. Furthermore, the total information of the system is conserved in time if there is no information exchange with the environment.

Consider a composite system consisting of two uncorrelated systems. Then the density matrix $\hat\rho^{(1,2)}$ of the two systems is the product of the density matrices $\hat\rho^{(1)}$ and $\hat\rho^{(2)}$ of the individual systems. We then have $\mathrm{Tr}\,(\hat\rho^{(1,2)})^2 = \mathrm{Tr}\,(\hat\rho^{(1)})^2\;\mathrm{Tr}\,(\hat\rho^{(2)})^2$. Now, if we alternatively normalize our measure of information (1.24) as

$$I(\vec p^{\,i}) = \sum_{j=1}^{n}(p_j^i)^2 - \frac{1}{n} \qquad (2.42)$$

and define the total information as

$$I_{total}^{(1,2)} = \log\left(\sum_{i=1}^{n+1}\sum_{j=1}^{n}(p_j^i)^2 - 1\right) = \log\mathrm{Tr}\,(\hat\rho^{(1,2)})^2, \qquad (2.43)$$

we arrive at the property that the total information content of a composite system consisting of two uncorrelated (non-entangled) systems is the sum of the information contents $I_{total}^{(1)}$ and $I_{total}^{(2)}$ of the individual systems, i.e.

$$I_{total}^{(1,2)} = I_{total}^{(1)} + I_{total}^{(2)}. \qquad (2.44)$$

The total information of the composite system consisting of two uncorrelated individual systems is carried by the two systems separately, with no information represented in the joint properties of the two systems.
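The additivity property (2.44) follows from the factorization of $\mathrm{Tr}\,\hat\rho^2$ for product states and can be illustrated as follows. This is a minimal sketch; the base of the logarithm is not fixed by the text, so base 2 is used here as an assumption, and the two example density matrices are arbitrary.

```python
import math


def kron(a, b):
    """Tensor product of two square matrices given as nested lists."""
    n, m = len(a), len(b)
    return [[a[i][j] * b[k][l] for j in range(n) for l in range(m)]
            for i in range(n) for k in range(m)]


def purity(rho):
    """Tr(rho^2) for a square matrix."""
    n = len(rho)
    return sum((rho[i][j] * rho[j][i]).real
               for i in range(n) for j in range(n))


def log_information(rho):
    """log Tr(rho^2), Eq. (2.43); base 2 is an assumed choice."""
    return math.log2(purity(rho))


rho_1 = [[0.7, 0.2 + 0.1j], [0.2 - 0.1j, 0.3]]  # illustrative state of system 1
rho_2 = [[0.5, 0.25], [0.25, 0.5]]              # illustrative state of system 2
rho_12 = kron(rho_1, rho_2)                     # uncorrelated composite state

print(purity(rho_12), purity(rho_1) * purity(rho_2))
print(log_information(rho_12),
      log_information(rho_1) + log_information(rho_2))
```

Both printed pairs agree. For an entangled state the factorization fails: a maximally entangled pure state of two qubits has $\mathrm{Tr}\,\hat\rho^2 = 1$ for the pair but $1/2$ for each reduced state, so the individual values no longer add up to the joint one.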
A.1
The state of an individual classical pointlike system (with no rotation and no internal degrees of freedom) is specified by its position $r$ and momentum $p$. To determine the state of a system one only has to measure $r$ and $p$. These variables together specify the values of all physical quantities of the system. When probabilities are associated with the possible values of $r$ and $p$, an uncertainty arises, because this probability distribution does not enable us to predict the value of a specific physical quantity exactly. Given the probability distribution $\rho(r, p, t)$ over the phase space, the total lack of information associated with an individual system is specified by

$$H_{total}^{r,p}(t) = -\int d^3r\,d^3p\;\rho(r, p, t)\log\frac{\rho(r, p, t)}{\mu(r, p)}, \qquad (A.45)$$

where $\mu(r, p)$ is a background measure in the phase space and the indices $r$, $p$ denote the set of independent variables chosen to specify the total lack of information.

If we perform a joint measurement of position and momentum, we reveal a property of the system, namely to have a certain position and momentum, which already existed before the measurement was performed, and we remove our ignorance about that pre-existing property. To specify our ignorance we use Shannon's measure, because this measure of information is adequate whenever a measurement can be assumed to reveal a pre-existing property (see Sec. 1.2).

If, instead of a joint measurement of position and momentum, we perform successive measurements in which the observation of position is followed by the observation of momentum or vice versa, the total amount of information we gain will not depend on the order in which the position and momentum measurements are performed. That is, the amount of information gained in a position measurement followed by a conditional momentum measurement is equal to the amount of information gained in a momentum measurement followed by a conditional position measurement:

$$H(r) + H(p|r) = H(p) + H(r|p). \qquad (A.46)$$
This is the property of independence of classical information from the order of its acquisition (see the discussion in Sec. 1.2). Here $H(p|r)$ and $H(r|p)$ are measures of the lack of conditional information. Note that they are not measures of the lack of information concerning one particular probability distribution, but an average of measures of lack of information, each associated with a conditional probability distribution, e.g.

$$H(p|r) = -\int d^3r_0\;\rho(r_0)\int d^3p\;\rho(p|r_0)\log\frac{\rho(p|r_0)}{\mu(p|r_0)}, \qquad (A.47)$$

where

$$\rho(p|r) = \frac{\rho(r, p)}{\int d^3p\;\rho(p, r)} \qquad (A.48)$$

is the conditional probability density in momentum space given that the system is found at $r$. The background measure $\mu(p|r_0)$ is an additional ingredient of the formalism that has to be added to ensure invariance under a change of variables when we consider continuous probability distributions [Jaynes, 1962].

The total lack of information of the system is independent of the particular choice of the complete set of variables considered. If, instead of $r$ and $p$, we consider a new set of independent variables which describe the system completely and are related to the old ones by the transformations $r' = r'(r, p)$ and $p' = p'(r, p)$, we find
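$$H_{total}^{r,p}(t) = -\int d^3r\,d^3p\;\rho(r, p)\log\frac{\rho(r, p)}{\mu(r, p)} = -\int d^3r'\,d^3p'\;\rho'(r', p')\log\frac{\rho'(r', p')}{\mu'(r', p')} = H_{total}^{r',p'}(t), \qquad (A.50)$$

where $\rho' = \rho/|J|$ and $\mu' = \mu/|J|$, and $J$ denotes the Jacobian of the transformation, which cancels in the ratio of the probability density to the background measure.

If a system is dynamically independent of the environment and not exposed to a measurement, there is no information exchange with the environment and the total lack of information of a classical system remains conserved in time. Introducing the notation $\Gamma = (r, p)$ and $d\Gamma = d^3r\,d^3p$, we calculate

$$\frac{dH_{total}}{dt} = -\frac{d}{dt}\int d\Gamma\;\rho\log\frac{\rho}{\mu} = -\int d\Gamma\;\log\frac{\rho}{\mu}\,\frac{\partial\rho}{\partial t} - \int d\Gamma\;\frac{\partial\rho}{\partial t}. \qquad (A.51)$$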
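The second integral vanishes due to the conservation of the total probability, $\int d\Gamma\,\rho = 1$. Using the Liouville equation $\frac{\partial\rho}{\partial t} + \frac{\partial}{\partial\Gamma}\left(\rho\,\frac{\partial\Gamma}{\partial t}\right) = 0$ we transform the first integral into

$$\frac{dH_{total}}{dt} = \int d\Gamma\;\log\frac{\rho}{\mu}\,\frac{\partial}{\partial\Gamma}\left(\rho\,\frac{\partial\Gamma}{\partial t}\right), \qquad (A.52)$$

which, after integration by parts and for a uniform background measure, becomes

$$\frac{dH_{total}}{dt} = \int d\Gamma\;\rho\,\frac{\partial}{\partial\Gamma}\frac{\partial\Gamma}{\partial t}. \qquad (A.53)$$

The integrand factor $\frac{\partial}{\partial\Gamma}\frac{\partial\Gamma}{\partial t}$ is zero. This is an immediate consequence of the fact that the evolution of the phase point in time follows the Hamiltonian equations $\dot r = \frac{\partial H}{\partial p}$ and $\dot p = -\frac{\partial H}{\partial r}$. Therefore we finally obtain

$$\frac{dH_{total}}{dt} = 0. \qquad (A.54)$$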
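The order-independence property (A.46) can be illustrated on a discretized phase space. The following sketch assumes a small joint probability table over a few position and momentum cells, a uniform background measure, and base-2 logarithms; these choices and all names are made for illustration and are not taken from the text.

```python
import math


def entropy(dist):
    """Shannon measure of the lack of information for a discrete distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


def sequential_lack(joint, first="r"):
    """Return (H(first), H(second | first)) for a joint table joint[r][p]."""
    table = joint if first == "r" else [list(col) for col in zip(*joint)]
    marginal = [sum(row) for row in table]
    h_first = entropy(marginal)
    # Average of the entropies of the conditional distributions.
    h_conditional = sum(m * entropy([x / m for x in row])
                        for m, row in zip(marginal, table) if m > 0)
    return h_first, h_conditional


# Illustrative joint distribution over 2 position cells and 3 momentum cells.
joint = [[0.10, 0.20, 0.05],
         [0.30, 0.05, 0.30]]

position_first = sum(sequential_lack(joint, first="r"))   # H(r) + H(p|r)
momentum_first = sum(sequential_lack(joint, first="p"))   # H(p) + H(r|p)
print(position_first, momentum_first)
```

Both orders give the same total, which equals the lack of information of the joint distribution itself.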
Chapter 3
In his article "Information is Physical" Landauer [1991] writes:

"Quite likely we are in a finite universe. In any case, Nature is unlikely to be so cooperative as to enable us to bring together an unlimited memory ... Information handling is limited by the laws of physics and the number of parts available in the universe; the laws of physics are, in turn, limited by the range of information processing available."

A closely related view was held by Feynman:

"It always bothers me that, according to the laws as we understand them today, it takes a computing machine an infinite number of logical operations to figure out what goes on in no matter how tiny a region of space and no matter how tiny a region of time, ... why should it take an infinite amount of logic to figure out what one tiny piece of space-time is going to do?"

Our physical description of the universe is represented by propositions. Any physical system can be described by a set of propositions together with their truth values - true or false. Any propositions we might assign to a system are arrived at only by observation, and represent our knowledge, i.e., information, of a system gained through observation. It is natural to assume that if we decompose a physical system, which may be represented by numerous propositions, into its constituent systems, each such constituent system will be described by fewer propositions. Thus the information content of a system scales with its size. How far, then, can this process of subdividing a system go? We reach a
nal limit when an individual system represents the truth value to one single proposition only. Such a system we call an elementary system. It is then suggestive that a principle of quantization of information [Zeilinger, 1999]: The most elementary system represents the truth value of one proposition. may serve as a foundational principle for quantum mechanics. We notice that the truth value of a proposition can be represented by one bit of information with true being identied with the bit value 1 and false being identied with the bit value 0. Thus, our principle becomes simply: The most elementary system carries 1 bit of information. This is then the reason for the irreducible randomness of an individual quantum event, for quantum complementarity, for the well-known sinusoidal dependence between probabilities and laboratory parameters (Sec. 3.3), and for quantum entanglement. We remark that our principle might be interpreted as a denition of what is the most elementary system. We emphasize again that by the notion a system represents the truth value of a proposition we mean a statement that can be veried directly by experiment. Any propositions we might assign to a physical system are arrived at only by observation. In the spirit of Bohr1 we always mean by observation observation of properties of our classical apparatus. We relate our notion of an elementary system to the ur object developed by von Weizs acker, who writes [1975]: It is certainly possible to decide any large alternative step by step in binary alternatives. This may tempt us to describe all objects as composite systems composed from the most simple possible objects. The simplest possible object is an object with a twodimensional Hilbert space, the ur. The word ur is introduced to have an abstract term for something which can be described by quantum theory and has a two-dimensional Hilbert space, and nothing more. 
Thus the ur, the elementary system, and the qubit (a new term in the theory of quantum information and quantum computation) can all be seen as different words for the same entity. An explicit example of an elementary system is the spin of a spin-1/2 particle. Evidently, a physical elementary particle also carries other properties, being
¹ Bohr [1949] emphasized that: "How far the [quantum] phenomena transcend the scope of classical physical explanation, the account of all evidence must be expressed in classical terms. The argument is simply that by the word 'experiment' we refer to a situation where we can tell others what we have done and what we have learned and that, therefore, the account of the experimental arrangement and the result of observation must be expressed in unambiguous language with suitable application of the terminology of classical physics."
thus elementary in more ways than one. Also, some of these properties may be represented by n-fold alternatives [von Weizsäcker 1975, 1985] rather than just the two-fold alternative represented by an elementary system.

Assume a measurement of the spin along the z-axis yields some definite result. The spin-1/2 particle then represents the truth value of only one proposition: "The spin along the z-axis is up." Since this is the only information the spin carries, measurement along any other direction must necessarily contain an element of randomness. As a simple consequence of the fact that an elementary system cannot carry enough information to provide definite answers to all questions that could be asked experimentally, probability arises as an "irreducible concept, incapable of further reduction ... a primary fundamental notion of physics", as remarked by Pauli [1955]. We emphasize that this kind of randomness must then be irreducible, that is, it cannot be reduced to hidden properties of the system. Otherwise the system would not be elementary for spin, that is, it would carry enough information to assign truth values to more than one proposition.

To illustrate this point we will briefly review the GHZ argument [Greenberger et al., 1989, 1990], [Mermin, 1990] for the refutation of the EPR program [Einstein et al., 1935]. Let us consider three space-like separated spin-1/2 particles in a GHZ state

$$|\psi_{GHZ}\rangle = \frac{1}{\sqrt{2}}\left(|z+, z+, z+\rangle - |z-, z-, z-\rangle\right). \qquad (3.1)$$

Since this state is an eigenstate of each of the three operators

$$\hat\sigma_x^1\hat\sigma_y^2\hat\sigma_y^3, \qquad \hat\sigma_y^1\hat\sigma_x^2\hat\sigma_y^3, \qquad \hat\sigma_y^1\hat\sigma_y^2\hat\sigma_x^3 \qquad (3.2)$$

with eigenvalue unity, the product of the results of the three individual spin measurements has to be +1 (in units of $\hbar/2$) for each set of operators in Eq. (3.2). This immediately affords application of the EPR reality criterion [Einstein et al., 1935]: "If, without in any way disturbing a system, we can predict with certainty the value of a physical quantity, then there exists an element of physical reality corresponding to this physical quantity." The element of physical reality ought then to exist independent of whether or not an observer actually cares to measure it. Because the product of the results of measuring one x component and two y components is unity in our state, we can predict with certainty the result of measuring the x component of the spin of any one of the three particles by measuring the y components of the other two, far away particles. According to
the EPR reality criterion this then asserts spin values $s_x^1$, $s_x^2$ and $s_x^3$ (= +1 or -1) specifying the truth values of the three propositions: "The spin of particle 1 is up along the x direction", "The spin of particle 2 is up along the x direction", and "The spin of particle 3 is up along the x direction", respectively.
In much the same way we can also predict with certainty the result of measuring the y component of the spin of any particle, by measuring one x component and one y component of the spins of the other two. Again, the EPR reality criterion asserts the spin values $s_y^1$, $s_y^2$ and $s_y^3$ (= +1 or -1) that now specify the truth values of the propositions "The spin of particle 1 is up along the y direction", "The spin of particle 2 is up along the y direction", and "The spin of particle 3 is up along the y direction", respectively. All six truth values have to be pre-assigned, because we can predict in advance what any one of the six truth values will be by measurements performed far enough away that they cannot disturb the assigned truth values, which are indeed subsequently revealed in appropriate measurements.
Because the values of the corresponding products $s_x^1 s_y^2 s_y^3$, $s_y^1 s_x^2 s_y^3$ and $s_y^1 s_y^2 s_x^3$ must be unity, so is their combined product. Since each individual $s_y^i$ is either +1 or -1 and each occurs twice in the combined product, that combined product is $s_x^1 s_x^2 s_x^3 = 1$. However, one easily verifies that

$$\hat\sigma_x^1\hat\sigma_x^2\hat\sigma_x^3 = -\left(\hat\sigma_x^1\hat\sigma_y^2\hat\sigma_y^3\right)\left(\hat\sigma_y^1\hat\sigma_x^2\hat\sigma_y^3\right)\left(\hat\sigma_y^1\hat\sigma_y^2\hat\sigma_x^3\right).$$
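The sign conflict can be checked numerically. The following is a minimal sketch in plain Python, not part of the original argument: it builds the three-particle operators of Eq. (3.2) as 8x8 matrices, applies them to the state (3.1), tests the operator identity stated above, and then enumerates all 64 pre-assigned value sets $(s_x^1, s_x^2, s_x^3, s_y^1, s_y^2, s_y^3)$ that an EPR-type description would allow. The helper names (`kron`, `matmul`, `apply`, `three`) and the basis ordering (z+ as index 0) are our own choices for illustration.

```python
import itertools
import math

# Pauli matrices as nested lists (basis order: z+ is index 0, z- is index 1).
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]

def kron(a, b):
    """Kronecker product of two square matrices."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(op, vec):
    """Apply a matrix to a state vector."""
    return [sum(row[k] * vec[k] for k in range(len(vec))) for row in op]

def close(u, v, tol=1e-12):
    """True if two vectors agree component by component."""
    return all(abs(x - y) < tol for x, y in zip(u, v))

def three(a, b, c):
    """Operator a (x) b (x) c acting on particles 1, 2, 3."""
    return kron(kron(a, b), c)

XYY, YXY, YYX = three(SX, SY, SY), three(SY, SX, SY), three(SY, SY, SX)
XXX = three(SX, SX, SX)

# GHZ state (3.1): (|z+,z+,z+> - |z-,z-,z->)/sqrt(2)
ghz = [0.0] * 8
ghz[0], ghz[7] = 1 / math.sqrt(2), -1 / math.sqrt(2)
minus_ghz = [-x for x in ghz]

# Quantum side: eigenvalue +1 for the operators (3.2), -1 for the x-x-x product.
plus_one_for_all = all(close(apply(op, ghz), ghz) for op in (XYY, YXY, YYX))
minus_one_for_xxx = close(apply(XXX, ghz), minus_ghz)

# Operator identity: XXX = -(XYY)(YXY)(YYX).
prod = matmul(matmul(XYY, YXY), YYX)
identity_holds = all(close(XXX[i], [-x for x in prod[i]]) for i in range(8))

# EPR side: pre-assigned values s = +/-1, ordered (sx1, sx2, sx3, sy1, sy2, sy3),
# restricted to those reproducing the three perfect correlations.
epr = [s for s in itertools.product((1, -1), repeat=6)
       if s[0] * s[4] * s[5] == 1 and s[3] * s[1] * s[5] == 1 and s[3] * s[4] * s[2] == 1]
xxx_values = {s[0] * s[1] * s[2] for s in epr}

print(plus_one_for_all, minus_one_for_xxx, identity_holds)
print(len(epr), xxx_values)
```

Every value assignment compatible with the three perfect correlations yields the product +1 for the three x components, whereas the quantum state gives -1, so no pre-assignment survives.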
Since the GHZ state (3.1) is an eigenstate with eigenvalue +1 of each of the three operators in Eq. (3.2), it is also an eigenstate of $\hat\sigma_x^1\hat\sigma_x^2\hat\sigma_x^3$ with eigenvalue -1. This is opposite in sign to the value required by the EPR program. Thus the failure of the EPR hypothesis can be shown in principle in a single experimental trial.

This powerful refutation of the EPR reality criterion emerges quite directly from one crucial difference between the assumption of elements of reality and the principle of quantization of information. While the EPR program assigns two bits of information to each individual particle, our foundational principle for quantum mechanics allows us to assign only one bit of information to each individual particle. The reality assumption deals with information in an extremely non-economical way. In a deterministic hidden variables theory, the spin values are assigned along all possible directions and consequently an individual particle carries an infinite amount of information. In contrast, the EPR reality assumption does not assume determinism as an extra premise. Applied to the GHZ state, the EPR reality assumption implies elements of physical reality which predetermine the outcomes of spin measurements along two directions for an individual particle, and consequently the individual particle carries just two bits of information. This assumption, however, is sufficient to show an extreme conflict with the predictions of quantum mechanics.

We remark that the foundational principle suggested above lends natural support to Bohr's notion of complementarity. The notion is well known, for example, for position and momentum, for the path of the system and the position of appearance in the interference pattern in the double-slit experiment, or for spin values of a spin-1/2 particle along orthogonal directions. The reason that precise knowledge of one quantity excludes any knowledge of other complementary quantities is again our principle of quantization of information. This will be analyzed in detail in Sec. 3.2.

Quantum entanglement follows from a slight generalization of the principle of quantization of information. We analyze how much information is contained in more complex systems consisting of N elementary systems. It is natural to assume that the information content of a complex system increases with its size. In fact, we propose that the information content of a complex system is proportional to the number of constituent elementary systems. We have our principle of quantization of information generalized to [Zeilinger, 1997, 1998]

"N elementary systems represent the truth values of N propositions,"

or equivalently,

"N elementary systems carry N bits."

We remark that our principle does not make any statement about how the information contained in the N propositions is distributed over the N systems. Evidently, there are many ways. We may consider N elementary systems which represent N independent individual propositions. Then the information contained in the N propositions is represented by the N systems individually; evidently each elementary system represents just one individual proposition.
Alternatively, information contained in the N propositions might be represented by the N systems in a joint way, in the extreme case with no individual system carrying any information on its own. In that case we have complete entanglement. See Sec. 2.2 for an analysis of the entanglement in the view of the ideas presented here. In this view we may also characterize the composite system in our example of the maximally entangled three-particle GHZ state (3.1). The composite system of the three particles in the GHZ state represents the truth values of the three
(compare with the six propositions assumed by the EPR program) propositions: "The product of spin values $s_x^1 s_y^2 s_y^3$ is 1", "The product of spin values $s_y^1 s_x^2 s_y^3$ is 1", and "The product of spin values $s_y^1 s_y^2 s_x^3$ is 1" ("The product of spin values $s_x^1 s_x^2 s_x^3$ is -1"). The proposition in parentheses is an alternative choice for the third proposition. In the maximally entangled three-particle GHZ state the information is expressed only in terms of relational properties of the three particles, and thus no individual particle carries any information, all information being contained in entanglement. With respect to the three statements above, we can define three projection operators:

$$\hat P_1 = |x+, y+, y+\rangle\langle x+, y+, y+| + |x+, y-, y-\rangle\langle x+, y-, y-| + |x-, y-, y+\rangle\langle x-, y-, y+| + |x-, y+, y-\rangle\langle x-, y+, y-|,$$

$$\hat P_2 = |y+, x+, y+\rangle\langle y+, x+, y+| + |y+, x-, y-\rangle\langle y+, x-, y-| + |y-, x-, y+\rangle\langle y-, x-, y+| + |y-, x+, y-\rangle\langle y-, x+, y-|,$$

$$\hat P_3 = |y+, y+, x+\rangle\langle y+, y+, x+| + |y+, y-, x-\rangle\langle y+, y-, x-| + |y-, y-, x+\rangle\langle y-, y-, x+| + |y-, y+, x-\rangle\langle y-, y+, x-|.$$

For a Hilbert space $\mathcal{H}$ the GHZ state can then be obtained as

$$|\psi_{GHZ}\rangle \in \hat P_1 \hat P_2 \hat P_3 \mathcal{H}. \qquad (3.3)$$
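That three propositions single out the GHZ state can be illustrated numerically. The sketch below is our own illustration under stated assumptions: each projector is written as $(1 + \hat O)/2$, with $\hat O$ one of the three spin-product operators, which is the projector onto the subspace where the corresponding product equals +1 and therefore spans the same subspace as the four product states listed for it. The helper names are hypothetical. It then checks that the product of the three projectors has trace one and coincides with $|\psi_{GHZ}\rangle\langle\psi_{GHZ}|$ for the state (3.1).

```python
import math

# Pauli matrices (basis order: z+ is index 0, z- is index 1).
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]

def kron(a, b):
    """Kronecker product of two square matrices."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def three(a, b, c):
    """Operator a (x) b (x) c acting on particles 1, 2, 3."""
    return kron(kron(a, b), c)

def projector(op):
    """(1 + op)/2: projector onto the +1 eigenspace of an operator with eigenvalues +/-1."""
    n = len(op)
    return [[((1 if i == j else 0) + op[i][j]) / 2 for j in range(n)]
            for i in range(n)]

P1 = projector(three(SX, SY, SY))   # "the product s_x^1 s_y^2 s_y^3 is 1"
P2 = projector(three(SY, SX, SY))   # "the product s_y^1 s_x^2 s_y^3 is 1"
P3 = projector(three(SY, SY, SX))   # "the product s_y^1 s_y^2 s_x^3 is 1"
P = matmul(matmul(P1, P2), P3)

# The trace of a projector equals the dimension of the subspace it projects onto.
dimension = round(sum(P[i][i] for i in range(8)).real)

# Compare with |GHZ><GHZ| for the state (3.1), whose amplitudes are real.
ghz = [0.0] * 8
ghz[0], ghz[7] = 1 / math.sqrt(2), -1 / math.sqrt(2)
is_ghz_projector = all(abs(P[i][j] - ghz[i] * ghz[j]) < 1e-12
                       for i in range(8) for j in range(8))

print(dimension, is_ghz_projector)
```

The joint subspace is one-dimensional, so three bits of purely relational information exhaust the description of the three-particle system.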
We see our generalized principle as a quantitative formulation of Schrödinger's [1935] idea that "If two separated bodies, each by itself known maximally, enter a situation in which they influence each other, and separate again, then there occurs regularly that which I have just called entanglement of our knowledge of the two bodies. The combined expectation-catalog consists initially of a logical sum of the individual catalogs; during the process it develops causally in accord with known law (there is no question whatever of measurement here). The knowledge remains maximal, but at its end, if the two bodies have again separated, it is not again split into a logical sum of knowledges about the individual bodies."
3.2 The Number of Mutually Complementary Propositions
In this section, it will be argued that the degree of randomness in a spin measurement must depend on the relative orientation between the measurement direction and the direction along which the spin-1/2 system gives a definite measurement result. We will show that the extreme case, with completely random measurement results, is realized when these directions are orthogonal to each other. From the (three-) dimensionality of space it immediately follows that there are altogether three mutually complementary spin measurements.

Our fundamental principle implies that in quantum mechanics it is not possible to assign definite truth values to all conceivable propositions simultaneously. Knowledge of the system obtained through earlier observation permits the prediction of individual outcomes with certainty for only a very limited class of experiments. These are the experiments which have definite outcomes, a situation which is abbreviated by saying that the corresponding propositions have definite truth values. A state of a physical system can be specified by listing true propositions; for indefinite propositions, however, only probabilistic predictions can be made.

Consider a stationary experimental arrangement with two detector plates, where only one detector plate is hit by a particle at a time, i.e. in each experimental trial. An explicit example would be the Stern-Gerlach experiment with a spin-1/2 particle, as depicted schematically in Fig. 3.1. Depending on whether the upper or the lower detector plate was hit by a particle we call the outcome "yes" and "no", respectively. The upper detector plate is hit with probability $p$. If it is not hit, the other detector plate is hit; this occurs with probability $q = 1 - p$. We are thus dealing with a binary alternative. In the specification of the state of the spin-1/2 particle, the truth values of those propositions that are associated with the system together are assigned consistently.
For example, the true proposition "The spin along the z-axis is up" and the false proposition "The spin along the -z-axis is up", assigned together to the system, are logically consistent and can be empirically verified by an analysis of the measuring process in which the magnet in the Stern-Gerlach apparatus is oriented in opposite directions. In what follows we assume that different experimental situations are always specified by the orientation of the magnet in the Stern-Gerlach apparatus as given in Fig. 3.1. Consider a state of a spin-1/2 particle which is specified by the true proposition "The spin along the z-axis is up", or equivalently by the false proposition "The spin along the -z-axis is up". This situation is described
by the probabilities $p(0) = 1$ and $p(\pi) = 0$ for the "yes" outcome. Because of the fundamental limitation of the information a spin-1/2 particle can carry, each proposition "The spin along the direction tilted at angle $\theta$ ($0 < \theta < \pi$) from the z-axis is up" has to be probabilistic for a particle in that state (Fig. 3.2). How does the probability of a "yes" count depend upon the angle $\theta$?

We argue that the mapping of $\theta$ to $p(\theta)$ has to be analytic and monotonic². We suggest that the reason for analyticity is again the fact that an elementary system carries information for one single proposition only. If $p(\theta)$ were only sectionally analytic in $\theta$, then there would be points of nonanalyticity separating two regions in which the function $p(\theta)$ has different analytic forms. Thus the values of the function on a finite segment in the interior of a domain of analyticity would only determine, by the uniqueness theorem for analytic functions, the function up to the next point of nonanalyticity. Clearly, to describe such a system completely we would need catalogs both of functional values on finite segments in the interior of each domain of analyticity and of the positions of the points of nonanalyticity. Both catalogs would imply that our system carries more information than allowed by the principle of quantization of information. The monotonicity will be confirmed in Sec. 3.3, where we will obtain the function $p(\theta)$ explicitly.

Figure 3.1: Spin measurement of a spin-1/2 particle. The particle passes through the Stern-Gerlach magnet oriented at the angle $\theta$, and then it hits one of the detector plates behind the Stern-Gerlach magnet. Depending on whether the upper or the lower detector plate is hit by a particle we call the outcome "yes" and "no", respectively.
We will now make use of the Cauchy theorem about continuous and monotonic functions: if a continuous and monotonic function $f(x)$ takes unequal values $f(a) = A$ and $f(b) = B$ at two different points $a$ and $b$ ($a < b$), then for each number $C$ between $A$ and $B$ there is one and only one point $c$ between $a$ and $b$ such that $f(c) = C$ ($a < c < b$; $A < C < B$ or $A > C > B$), i.e. the function $f(x)$ goes through all values between $A$ and $B$ exactly once. According to the Cauchy theorem about continuous and monotonic functions, there has to be one and only one angle of orientation of the magnet in the Stern-Gerlach apparatus where the probabilities for a "yes" and for a "no" outcome are equal. Because of the symmetry of the problem this obviously has to be the angle $\pi/2$. For each direction $\vec{n}$ in the x-y plane (the green circle on the sphere in Fig. 3.3) the proposition "The spin along the $\vec{n}$-axis is up" is completely indefinite, that is, we have absolutely no knowledge which outcome, "yes" or "no", will be observed in a specific individual measurement. In principle, this equal number of yes-no outcomes could be achieved by an ensemble of systems each giving a definite result for each direction, such that the same number of "yes" and "no" results is obtained. Yet again this would imply that an individual system carries enough information to permit assignment of definite truth values to all possible propositions, in contradiction to our basic principle.

² It turns out that analyticity is not necessary as a separate condition and that it suffices to assume the continuity condition. Analyticity then follows immediately (Appendix B.1).

Figure 3.2: The gradual change of the probability $p(\theta)$ of a "yes" (spin up) count with a gradual change of the orientation $\theta$ of the magnet in the Stern-Gerlach apparatus. The measurement along the z-axis gives the result "yes" with certainty. Because of the symmetry of the problem the probabilities for a "yes" and for a "no" count in a measurement along any direction in the x-y plane (the green circle) are equal (= 1/2). How does the probability $p(\theta)$ of a "yes" count depend on $\theta$ explicitly?

Consider now the state of a spin-1/2 particle specified by the proposition "The spin along the x-axis is up (down)". In this case we have complete knowledge of which outcome will be observed when the Stern-Gerlach magnet is oriented along the x-axis, at the expense of having absolutely no knowledge about the outcome when the magnet is oriented along any direction in the y-z plane (the yellow circle on the sphere in Fig. 3.3). Finally, consider the state of a spin-1/2 particle specified by the proposition "The spin along the y-axis is up (down)". In that case we know precisely the
outcome of the experiment when the Stern-Gerlach magnet is oriented along the y-axis, at the expense of our complete uncertainty about the outcome when the magnet is oriented along any direction in the x-z plane (the red circle on the sphere in Fig. 3.3).

Obviously, there are altogether three mutually exclusive or complementary propositions (represented by the three intersection points of the green, yellow and red circles on the sphere in Fig. 3.3): "The spin along direction $\vec{n}_1$ is up (down)", "The spin along direction $\vec{n}_2$ is up (down)" and "The spin along direction $\vec{n}_3$ is up (down)", where $\vec{n}_1$, $\vec{n}_2$ and $\vec{n}_3$ are mutually orthogonal directions. These are propositions with the property of mutual exclusiveness: total knowledge of one proposition is always at the expense of total ignorance about the other two complementary ones. Precise knowledge of the outcome of one experiment therefore implies that all possible outcomes of the complementary ones are equally probable. We emphasize that the total number of three mutually complementary propositions for the spin might be seen as a consequence of the (three-) dimensionality of space. Since the theory of spin-1/2 particles affords a model of the quantum mechanics of all two-state systems, we conclude that there are always three mutually complementary propositions whenever binary alternatives are investigated.

Figure 3.3: The formation of mutually complementary propositions associated with orthogonal spin components. If measurement along the z-axis (x-axis) [y-axis] gives a definite result, measurement along any direction in the x-y plane, the green circle (y-z plane, the yellow circle) [x-z plane, the red circle] will be maximally random, respectively. There are altogether three mutually complementary spin measurements, represented by the three intersection points of the green, yellow, and red circles.
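The mutual exclusiveness of the three propositions can be tabulated directly. The sketch below is only an illustration and assumes the standard quantum-mechanical prediction $p = \cos^2(\theta/2) = (1 + \cos\theta)/2$ for a spin-1/2 particle measured at angle $\theta$ from the direction along which it is definitely "up"; the function name `p_yes` and the sampling of the green circle are our own choices.

```python
import math

def p_yes(state_dir, meas_dir):
    """Probability of 'spin up' along the unit vector meas_dir for a spin-1/2
    particle prepared 'up' along the unit vector state_dir.

    Assumes the standard quantum prediction p = (1 + cos(theta))/2, where
    theta is the angle between the two directions.
    """
    cos_theta = sum(s * m for s, m in zip(state_dir, meas_dir))
    return (1 + cos_theta) / 2

axes = {"x": (1, 0, 0), "y": (0, 1, 0), "z": (0, 0, 1)}

# Rows: direction with a definite result; columns: direction measured.
table = {(a, b): p_yes(axes[a], axes[b]) for a in axes for b in axes}
for a in axes:
    print(a, [table[(a, b)] for b in axes])

# Spin definitely up along z: every direction on the green circle (x-y plane)
# gives a completely random outcome.
circle = [p_yes(axes["z"], (math.cos(phi), math.sin(phi), 0.0))
          for phi in (k * math.pi / 6 for k in range(12))]
```

Each row of the table contains one certain outcome and two completely random ones, which is the pattern of total knowledge of one proposition at the expense of total ignorance of the other two.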
3.3 Malus' Law in Quantum Mechanics
Quantum theory predicts $p(\theta) = \cos^2(\theta/2)$ for the probability to find the spin up along the direction at an angle $\theta$ with respect to the direction along which the system gives spin up with certainty. We ask: from what deeper foundation does this law, initially formulated by Malus³ for light, emerge in quantum mechanics? The most important contributions so far in that direction are those of Wootters [1981], Summhammer [1988, 1994] and Fivel [1994]. In this section we argue that the most natural functional relation between the probability for a specific outcome to occur and laboratory parameters, consistent with the principle that an elementary system carries only one bit of information, must be the sinusoidal dependence.

Consider again a stationary experimental arrangement with two detectors, where only one detector fires in each experimental trial. Detector 1, say, fires (we call this the "yes" outcome) with probability $p$. If it does not fire (the "no" outcome) then the other detector fires; this occurs with probability $q = 1 - p$. The experimenter's measure of information about which individual outcome, "yes" or "no", will occur in a single future experimental trial is given by (see Sec. 1.3)

$$I(p, q) = (p - q)^2. \qquad (3.4)$$
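As a quick numerical illustration of the measure (3.4), the following sketch evaluates it for a few probabilities; the function name `information` is our own and the sample values are arbitrary.

```python
def information(p):
    """Measure of information I(p, q) = (p - q)^2 with q = 1 - p, Eq. (3.4)."""
    q = 1 - p
    return (p - q) ** 2

# From a certain "yes" through a completely random outcome to a certain "no".
for p in (1.0, 0.75, 0.5, 0.25, 0.0):
    print(p, information(p))
```

The measure is maximal (one bit) for a certain outcome, vanishes for $p = q = 1/2$, and takes the same value when $p$ and $q$ are interchanged.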
³ Etienne Louis Malus (1775-1812), a French physicist, was almost entirely concerned with the study of light. He conducted experiments to verify Huygens' theories of light and rewrote the theory in analytical form. His discovery of the polarization of light by reflection was published in 1809 and his theory of double refraction of light in crystals in 1810.

Figure 3.4: Various sets of three mutually complementary Stern-Gerlach arrangements labelled by a single experimental parameter $\theta$ which specifies the orientation of the Stern-Gerlach magnet in each of the experiments. They are associated with the following sets of mutually complementary propositions: $P_1(\theta)$: "The spin along the x-axis is up", $P_2(\theta)$: "The spin is up along the direction tilted at angle $\theta$ from the z-axis" and $P_3(\theta)$: "The spin is up along the direction tilted at angle $\theta + 90°$ from the z-axis".

This measure is invariant under permutation of the set of possible outcomes. In other words, it is a symmetrical function of $p$ and $q$. A permutation of the set of possible outcomes can be achieved in two manners, which may be called active and passive. From the passive point of view the permutation is obtained by a simple relabelling of the possible outcomes, and the property of invariance is self-evident because relabelling obviously does not make an experiment more predictable. From the active point of view, one retains the same labelling, and the permutation of the set of outcomes refers to a change of the experimental set-up. For a spin measurement this would be a re-orientation of the Stern-Gerlach magnet. In that case the property of invariance states that the amount of information is indifferent under real physical changes of the experimental situation. This requirement is more stringent and may be precisely formulated as an invariance of the measure of information under interchange of the following two physical situations: a) the probability for "yes" is $p$ and for "no" is $q$; and b) the probability for "yes" is $q$ and for "no" is $p$. But these are different experimental situations corresponding to different information. In order to remove this ambiguity we can associate with each specific outcome its probability of occurrence, or assign different numbers or other distinct labels to the possible outcomes; the particular scheme is of no further relevance. We use the quantity

$$i := p - q, \qquad (3.5)$$
because it also specifies the amount of information through $I = i^2$. We call this quantity the information with respect to a single specific measurement, because it is the whole information of a particular physical situation, equivalent to assigning specific probabilities to each of the possible results. All the quantum state is meant to be is a representation of that catalog⁴ of our knowledge of the system that is necessary to arrive at the set of, in general probabilistic, predictions for all possible future observations of the system. We describe a system by a catalog of information (information vector) $\vec{i} = (i_1, i_2, i_3)$ about a complete set of mutually complementary propositions.
⁴ A set of complex amplitudes of a $\psi$-function is a specific representation of the catalog of our knowledge of the system. This view was taken by Schrödinger [1935], who wrote: "Sie ((die $\psi$-Funktion)) ist jetzt das Instrument zur Voraussage der Wahrscheinlichkeit von Maßzahlen. In ihr ist die jeweils erreichte Summe theoretisch begründeter Zukunftserwartung verkörpert, gleichsam wie in einem Katalog niedergelegt." Translated: "It (the $\psi$-function) is now the means for predicting the probability of measurement results. In it is embodied the momentarily attained sum of theoretically based future expectation, somewhat as laid down in a catalog."
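[Figure 3.5 graphic: the parametric axis $\theta$, carrying the propositions $P_1(\theta)$, $P_2(\theta)$, $P_3(\theta)$, and the space of information with axes 1, 2, 3, showing the information vector $\vec{i}(\theta)$ with components $i_1(\theta)$, $i_2(\theta)$, $i_3(\theta)$.]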
Figure 3.5: Representation of the state of a quantum system by the information vector $\vec{i}(\theta)$. The components $i_1(\theta)$, $i_2(\theta)$, $i_3(\theta)$ of the information vector take the values of the information associated with three mutually complementary propositions $P_1(\theta)$, $P_2(\theta)$, $P_3(\theta)$. Such a set of propositions is, for example, $P_1(\theta)$: "The spin is up along the direction x", $P_2(\theta)$: "The spin is up along the direction at the angle $\theta$", and $P_3(\theta)$: "The spin is up along the direction at the angle $\theta + 90°$". Here, the direction labelled by $\theta$ is assumed to lie in the y-z plane, oriented at an angle $\theta$ with respect to the z-axis.

Note that different lists of three mutually complementary propositions are labelled by a single experimental parameter $\theta$, as given in Fig. 3.4. They correspond to different representations $\vec{i}(\theta) = (i_1(\theta), i_2(\theta), i_3(\theta))$ of the catalog of our knowledge of the system, as shown in Fig. 3.5.

We wish to specify a mapping of $\theta$ onto $\vec{i}(\theta)$. It is important to note that we can invent this mapping freely. The reason for this is that $\theta$ will have functional relations to other physical parameters of the experiment. Then, the laws relating those parameters and the information vector $\vec{i}(\theta)$ can be seen as laws between those parameters and $\theta$ plus a mapping of $\theta$ to $\vec{i}(\theta)$. What basic criteria should we follow to obtain the mapping from $\theta$ to $\vec{i}(\theta)$? We will use a mapping where neighboring values of $\theta$ correspond to neighboring values of $\vec{i}(\theta)$. Thus, if we gradually change the orientation of the magnet in each of the mutually exclusive Stern-Gerlach apparatuses, this will result in a continuous change of the information vector.

We now define the total information content $I_{total}$ of a system as the sum of the individual measures of information of a complete set of mutually complementary propositions,
$$I_{total} = I_1 + I_2 + I_3 = i_1^2 + i_2^2 + i_3^2. \qquad (3.6)$$
Figure 3.6: Two different sets of mutually complementary spin measurements (the full set also includes the spin measurement along the x-axis, which is not shown in the figures). They correspond to the following two sets of mutually complementary propositions: {$P_1(0)$: "The spin along the x-axis is up", $P_2(0)$: "The spin along the y-axis is up", $P_3(0)$: "The spin along the z-axis is up"}, and {$P_1(\theta)$: "The spin along the x-axis is up", $P_2(\theta)$: "The spin along the direction tilted at angle $\theta$ from the z-axis is up", $P_3(\theta)$: "The spin along the direction tilted at angle $\theta + 90°$ from the z-axis is up"}. The total information carried by the spin is independent of the particular set of mutually complementary propositions considered, i.e. $I_{total} = I_1(0) + I_2(0) + I_3(0) = 1 + 0 + 0 = 1 = I_1(\theta) + I_2(\theta) + I_3(\theta)$ in the example shown.

We require that the total information content of a system is invariant under a change of representation of the catalog of our knowledge of the system, that is, independent of the particular set of mutually complementary propositions considered (see Fig. 3.6). In the same spirit as choosing a coordinate system, one may then choose any set of mutually complementary propositions to represent our knowledge of the system, the total information about the system being invariant under that choice, i.e. for all $\theta$
$$I_{total} = I_1(\theta) + I_2(\theta) + I_3(\theta) = i_1^2(\theta) + i_2^2(\theta) + i_3^2(\theta). \qquad (3.7)$$
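The invariance (3.7) can be illustrated with a short numerical sketch. It is not part of the original argument and rests on an assumption we state explicitly: for a spin-1/2 state with spin (Bloch) vector $\vec{s}$, the standard quantum prediction for a "yes" along the unit direction $\vec{n}$ is $p = (1 + \vec{n}\cdot\vec{s})/2$, so that $i = p - q = \vec{n}\cdot\vec{s}$. The function names and sample states are hypothetical.

```python
import math

def info_vector(s, theta):
    """Information vector (i1, i2, i3) for the complementary set labelled by theta.

    s is the spin (Bloch) vector of the state. The three measurement directions
    are x and two orthogonal directions in the y-z plane, tilted by theta and
    theta + 90 degrees from the z-axis. With the assumed prediction
    p = (1 + n.s)/2 for each direction n, the information is i = p - q = n.s.
    """
    n1 = (1.0, 0.0, 0.0)
    n2 = (0.0, math.sin(theta), math.cos(theta))
    n3 = (0.0, math.cos(theta), -math.sin(theta))
    return tuple(sum(a * b for a, b in zip(n, s)) for n in (n1, n2, n3))

def total_information(s, theta):
    """I_total = i1^2 + i2^2 + i3^2, Eqs. (3.6) and (3.7)."""
    return sum(i * i for i in info_vector(s, theta))

pure = (0.0, 0.0, 1.0)    # "The spin along the z-axis is up": one definite proposition
mixed = (0.0, 0.0, 0.0)   # completely mixed, e.g. one member of a maximally entangled pair

thetas = [k * math.pi / 8 for k in range(16)]
pure_totals = [total_information(pure, t) for t in thetas]
mixed_totals = [total_information(mixed, t) for t in thetas]

print(info_vector(pure, 0.0), info_vector(pure, math.pi / 4))
print(min(pure_totals), max(pure_totals), max(mixed_totals))
```

The individual components change with $\theta$, while their sum of squares stays at one bit for the pure state and at zero for the completely mixed one.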
Accordingly, $0 \le I_{total} \le 1$ (in bits) for the information content of an elementary system. The maximal value of one bit of information is reached when only one single proposition with a definite truth value is assigned to the system (pure state). Note that we also include the cases when an elementary system carries an amount of information less than what is necessary to assign a definite truth value to one single proposition. This may occur when an elementary system is a constituent of a larger composite system, i.e. when an elementary system is entangled with other elementary systems. Then, the total information content of the composite system
might be partially encoded in specific joint properties of the composite system, leaving less than 1 bit of information to specify the properties of individual constituents (see Sec. 2.2). In the case of maximal entanglement there cannot be any information carried by the individual constituents, because the complete information of the composite system is exhausted in defining joint properties, and no further possibility exists to also encode information in the individual constituents, i.e. the information content of the individual constituents attains its minimal value of 0 bits of information. Then, no proposition with a definite truth value can be made about the individual elementary systems (completely mixed state).

The property of invariance of the total information carried by an elementary system implies that, as the experimental parameter is gradually changed from $\theta_0$ to $\theta_1$, the information vector rotates in the space of information with its length conserved, as shown in Fig. 3.7, i.e.

$$\vec{i}(\theta_1) = \hat R(\theta_1 - \theta_0, \theta_0)\, \vec{i}(\theta_0), \qquad (3.9)$$

where $\hat R(\theta_1 - \theta_0, \theta_0)$ is an orthogonal matrix, $\hat R^{-1}(\theta_1 - \theta_0, \theta_0) = \hat R^{T}(\theta_1 - \theta_0, \theta_0)$. Notice that the transformation matrices do not build up a group in general, because of the explicit dependence on both the initial and final parametric value.

Equation (3.9) expresses our expectation that the transformation law is linear, that is, independent of the actual information vector transformed. No particular information vector is preferred by our foundational principle, and the relation of equivalence between all possible information vectors is maintained by the transformation law. That means precisely that information vectors initially equally distributed over the whole space of information will again be transformed into equally distributed final information vectors. See also Sec. 3.6 for a detailed discussion and for the relation between the linearity of the evolution law (there the parameter is time) and the no-superluminal-signaling requirement.

We assume that no physical process a priori distinguishes one specific value of the physical parameter from others, that is, that the parametric $\theta$-axis is homogeneous. In our example with the orientation of Stern-Gerlach magnets as an experimental parameter, the homogeneity of the parametric axis becomes equivalent to the isotropy of ordinary space. The homogeneity of the parametric axis precisely requires that if we shift the physical situations of three complementary experiments, together with the state of the system, along the parametric axis by any real number $b$, we cannot observe any effect. Suppose
two lists of mutually complementary experimental arrangements are associated with a specific parametric value $\theta_0$ and with some other value $\theta_0 + b$ ($-\infty < b < +\infty$), respectively. Furthermore, suppose the information vectors $\vec{i}(\theta_0)$ and $\vec{i}(\theta_0 + b)$ associated with the two lists of complementary experiments are equal (i.e. all components of the two vectors are equal). The homogeneity of the parametric $\theta$-axis then requires that if we change the physical parameter in each experiment by an equal interval $\theta - \theta_0$ in the two lists of complementary experiments, the resulting information vectors will be equal, as shown in Fig. 3.8. Formally, if $\vec{i}(\theta_0) = \vec{i}(\theta_0 + b)$ implies

$$\hat R(\theta - \theta_0, \theta_0)\, \vec{i}(\theta_0) = \hat R(\theta - \theta_0, \theta_0 + b)\, \vec{i}(\theta_0 + b)$$

for all $\theta_0$, then⁵

$$\hat R(\theta - \theta_0, \theta_0) = \hat R(\theta - \theta_0, \theta_0 + b). \qquad (3.10)$$

Figure 3.7: A generalized rotation of the information vector from $\vec{i}(\theta_0)$ to $\vec{i}(\theta_1)$ due to a change of the physical parameter from $\theta_0$ to $\theta_1$. [Graphic: the parametric axis with the propositions $P_1$, $P_2$, $P_3$ at $\theta_0$ and at $\theta_1$, and the space of information with the vectors $\vec{i}(\theta_0)$ and $\vec{i}(\theta_1)$.]
The transformation matrix then depends only on the difference between the initial and final value of the experimental parameter, and not on the location of these values on the parametric $\theta$-axis. The orthogonality condition leads to the following general form of the transformation matrix

$$\hat R(\theta) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & f(\theta) & -g(\theta) \\ 0 & g(\theta) & f(\theta) \end{pmatrix}, \qquad (3.11)$$

where $f(\theta)$ and $g(\theta)$ are not yet specified, but continuous, functions satisfying

$$f^2(\theta) + g^2(\theta) = 1, \quad f(0) = 1 \quad \text{and} \quad g(0) = 0. \qquad (3.12)$$

⁵ We give another line of reasoning, that is, to require the same functional dependence of the transformation law for each initial value $\theta_0$ of the parameter. This can only be done with Eq. (3.10).
Figure 3.8: The homogeneity of the parametric $\theta$-axis, shown in the space of information. Here we take $\theta_0 = 0$ for simplicity.

We further require that a change of the experimental parameter in a set of mutually complementary arrangements from $\theta_0$ to $\theta_1$ and subsequently from $\theta_1$ to $\theta_2$ must have the same physical effect as a direct change of the parameter from $\theta_0$ to $\theta_2$. The resulting transformation will then be independent of whether we apply two consecutive transformations $\hat{R}(\theta_1 - \theta_0)$ and $\hat{R}(\theta_2 - \theta_1)$ or a single transformation $\hat{R}(\theta_2 - \theta_0)$:
$$\hat{R}(\theta_2 - \theta_0) = \hat{R}(\theta_2 - \theta_1)\,\hat{R}(\theta_1 - \theta_0). \tag{3.13}$$
This, together with the property that for $\theta = 0$ (no changes of the physical situations of the complementary experiments) the transformation matrix equals the unity matrix ($\hat{R}(0) = \hat{I}$), implies that the transformation matrices build up the group of rotations SO(3), a connected subgroup of the group of orthogonal matrices O(3) which contains the identity transformation. For the special case of an infinitesimally small variation of the experimental conditions, Eq. (3.13) reads
$$\hat{R}(\theta + d\theta) = \hat{R}(\theta)\,\hat{R}(d\theta). \tag{3.14}$$
Inserting the form (3.11) of the transformation matrix into the latter expression, one obtains
$$f(\theta + d\theta) = f(\theta) f(d\theta) - g(\theta) g(d\theta). \tag{3.15}$$
[...] necessary and sufficient condition for their analyticity. Using conditions (3.12), we now may transform Eq. (3.15) into the differential equation
$$\frac{df(\theta)}{d\theta} = -n\sqrt{1 - f^2(\theta)}, \tag{3.16}$$
where $n = g'(0)$ is a constant. The solution of the differential equation reads
$$f(\theta) = \cos n\theta, \tag{3.17}$$
where we integrate between $0$ and $\theta$ using the condition $f(0) = 1$. This finally leads to
$$\hat{R}(\theta) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos n\theta & -\sin n\theta \\ 0 & \sin n\theta & \cos n\theta \end{pmatrix}. \tag{3.18}$$
And this result directly leads to the familiar expression
$$p = \cos^2\frac{n\theta}{2} \tag{3.19}$$
for probability in quantum theory. In a world whose most elementary constituents give a definite result in one specific experiment only, the probability must vary as $\cos^2 n\theta$ (written in this form, $p = \cos^2 n\theta$ corresponds to an information component $i = 2p - 1 = \cos 2n\theta$, so that $n$ here counts half the rotation angle of the information vector per unit $\theta$). Our world is built just this way, with $n = 1/2$ for electrons and neutrinos and $\theta$ a relative orientation between the spin vector and the measurement direction in the Stern-Gerlach experiment, or with $n = 1$ for photons and $\theta$ a relative orientation between the polarization vector and the Nicol prism in the polarization experiment [Brukner and Zeilinger, 1999(b)], or with $n = 2$ for gravitons and $\theta$ a relative polarization angle, or with the phase shift $\varphi = n\theta$ between two paths inside the interferometer in the interference experiment with a Mach-Zehnder type of interferometer. For the latter see also Sec. 2.1 and 3.4.
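As an illustration of the steps above, the following minimal sketch builds a rotation matrix of the form (3.18), checks the composition law (3.13) numerically, and converts a rotated information component into a probability. The link $i = 2p - 1$ between an information component and the probability of a binary proposition is an assumption of the sketch (its defining equation lies outside this section), and the helper names `rotation`, `matmul`, `apply` and `probability` are hypothetical.

```python
import math

def rotation(theta, n=1.0):
    """Matrix of the form (3.18): rotation by n*theta about the first axis."""
    c, s = math.cos(n * theta), math.sin(n * theta)
    return [[1.0, 0.0, 0.0],
            [0.0, c, -s],
            [0.0, s, c]]

def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]

def apply(m, v):
    """Apply a 3x3 matrix to a 3-component information vector."""
    return [sum(m[r][k] * v[k] for k in range(3)) for r in range(3)]

def probability(i_component):
    """Assumed link for a binary proposition: i = p - (1 - p) = 2p - 1."""
    return (1.0 + i_component) / 2.0

def max_abs_diff(a, b):
    """Largest entry-wise difference between two 3x3 matrices."""
    return max(abs(a[r][c] - b[r][c]) for r in range(3) for c in range(3))

n, t1, t2 = 1.0, 0.4, 1.1  # illustrative values only
# Composition law (3.13): one change by t1 + t2 equals two consecutive changes.
composition_error = max_abs_diff(rotation(t1 + t2, n),
                                 matmul(rotation(t2, n), rotation(t1, n)))
# Start with complete information about the second proposition.
i_rotated = apply(rotation(t1, n), [0.0, 1.0, 0.0])
p = probability(i_rotated[1])
print(composition_error < 1e-12, abs(p - math.cos(n * t1 / 2) ** 2) < 1e-12)
```

Under the stated assumption the rotated component is $\cos n\theta$, so the probability comes out as $\cos^2(n\theta/2)$, which is what the last check compares against.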
3.4 The deBroglie Wavelength
The concept of the deBroglie wavelength is often used in obtaining a picture of the deBroglie wave as a real wave all over space. Yet, the deBroglie wavelength has rather strange properties not found in any classical wave, which not only limit its usefulness for obtaining such a picture of the deBroglie wave, but also make this picture false [Zeilinger, 1990]. For example, the deBroglie wavelength is only defined through interference experiments, i.e., when at least two paths $P1$ and $P2$, albeit possibly neighboring ones, interfere:
$$\left(e^{i\int_{P1}\vec{k}\cdot d\vec{r}}\right)^{*} e^{i\int_{P2}\vec{k}\cdot d\vec{r}} = e^{i\int_{-P1+P2}\vec{k}\cdot d\vec{r}} = e^{i\oint\vec{k}\cdot d\vec{r}}. \tag{3.20}$$
The deBroglie wavelength therefore is not gauge invariant. Strictly speaking, the deBroglie wavelength only appears in experimental predictions in the form of a path integral over closed loops, $\oint\vec{k}\cdot d\vec{r}$. Another interesting property is that the deBroglie wavelength is not Galilei invariant [Lévy-Leblond 1974, 1976]. On the contrary, it changes according to $\vec{k}' = \vec{k} + m\vec{v}/\hbar$. This may be seen as a corollary of Eq. (3.20) because $\oint m\vec{v}\cdot d\vec{r} = 0$. None of these properties are shared by classical waves, and therefore the deBroglie wavelength does not have an immediate conceptual significance. It merely serves as an aid to calculating interference patterns, which again means that it only helps us to calculate statistical predictions of the distributions of particles in an interference experiment.

Since the deBroglie wavelength is only defined through interference experiments, and interference is closely related to information (interference appears whenever the particle is measured in such a way that the measurement is not able, not even in principle, to reveal any information about the path the particle takes), we suggest that the deBroglie wavelength must be based on the much more fundamental concept of information.

Figure 3.9: Principle sketch of three mutually exclusive interference experiments with a Mach-Zehnder type of interferometer. The length of the upper path inside the interferometer is adjustable. Into each of the two paths inside the interferometer in Fig. 3.9b a detector is inserted with the property that it detects the particle without absorbing it.

In what follows we shall analyze mutually complementary propositions in an interference experiment with an idealized Mach-Zehnder type of interferometer where the length of the upper path inside the interferometer is adjustable (Fig. 3.9). Suppose that in the presence of the extension $x$ of the upper path inside the interferometer, the particle will exit with certainty towards the upper (lower) detector behind the beam splitter. In this case we have complete knowledge of the beam the particle will be found in behind the beam splitter, at the expense of having absolutely no knowledge of which path the particle took inside the interferometer. By a gradual change of the extension of the path inside the interferometer, our knowledge of the beam the particle will be found in behind the beam splitter will decrease. In the extreme case of the extension $x + \lambda/4$ of the upper path inside the interferometer, we have absolutely no knowledge of which path the particle will take behind the beam splitter. Analogous to the interference experiment analyzed in Sec. 2.1, we have three mutually complementary propositions. The various complete sets of mutually complementary propositions might be labeled by a parameter $x$:

$P_1(x)$: "The particle takes the outgoing path towards the upper detector in the presence of the extension $x$ of the path,"
$P_2(x)$: "The particle takes the upper path inside the interferometer," and
$P_3(x)$: "The particle takes the outgoing path towards the upper detector in the presence of the extension $x + \lambda/4$ of the path."

Note that $\lambda$ has the dimension of a length in ordinary space. By a gradual change of the experimental parameter $x$ in the three mutually exclusive arrangements we may reduce our knowledge about one of the propositions $P_1(x)$ or $P_3(x)$, correspondingly increasing our knowledge about the other. Since ordinary space is homogeneous, that is, there is no physical process that distinguishes one location in ordinary space from others, no specific value of the parameter $x$ will be preferred. This then leads immediately to Malus' law for the quantum interference experiment. According to the general solution (3.18), the continuous extension of the upper path inside the interferometer in the two mutually exclusive experiments in Fig. 3.9a and Fig. 3.9c will result in a periodic change of information between $i_1(x)$ and $i_3(x)$ regarding the beam the particle will be found in behind the beam splitter in the two complementary experiments, respectively. In the space of information this corresponds to a rotation of the information vector around the axis that is associated with the which-path information (the change of the length of the path inside the interferometer does not affect our information $i_2(x)$ about the particle's path inside the interferometer). We therefore obtain
$$\begin{aligned} i_1(x) &= i_1(0)\cos(kx) - i_3(0)\sin(kx), \\ i_2(x) &= i_2(0), \\ i_3(x) &= i_1(0)\sin(kx) + i_3(0)\cos(kx), \end{aligned} \tag{3.21}$$
where $k$ has dimension [1/length]. This shows again the existence of a minimal extension $x_{min} = 2\pi/k =: \lambda$ of the path of the particle inside the interferometer (in ordinary space) for which the information vector makes one complete rotation in the space of information (Fig. 3.10).

Figure 3.10: One complete rotation of the information vector $\vec{i}(x)$ in the space of information, around the axis of the which-path information, after an extension of $\lambda$ of the path inside the interferometer in Fig. 3.9.

We therefore define the deBroglie wavelength as the minimal extension of the path inside the interferometer after which information about any proposition $P$ (that differs from $P_2$) takes the same value. Formally, the deBroglie wavelength $\lambda$ is defined through the relation
$$i_P(x + \lambda) = i_P(x) \quad \forall\, P \neq P_2. \tag{3.22}$$
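The periodicity expressed by Eqs. (3.21) and (3.22) can be checked with a short numerical sketch. The value of `k` and the starting vector are hypothetical, and the function names are chosen here for illustration only.

```python
import math

def info_vector(x, i0, k):
    """Eq. (3.21): i1 and i3 rotate by the angle k*x, i2 stays fixed."""
    i1, i2, i3 = i0
    c, s = math.cos(k * x), math.sin(k * x)
    return (i1 * c - i3 * s, i2, i1 * s + i3 * c)

def wavelength(k):
    """Minimal extension x_min = 2*pi/k after which i1 and i3 repeat."""
    return 2.0 * math.pi / k

k = 5.0                      # hypothetical value, in units of 1/length
start = (1.0, 0.0, 0.0)      # complete knowledge of P1 at x = 0
lam = wavelength(k)
quarter = info_vector(lam / 4.0, start, k)   # knowledge has moved to P3
full = info_vector(lam, start, k)            # back at the starting vector
print(quarter, full)
```

After an extension of a quarter wavelength the information has passed completely from the first to the third proposition, and after a full wavelength the vector returns to its starting value, while the which-path component never changes.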
3.5 Dynamics of Information
Any assignment of properties to an object is only due to observation. Using information obtained in previous observations we wish to make predictions about the future. Again, our predictions might be formulated as, in general probabilistic, predictions about future properties of a system. Clearly, these predictions can be verified or falsified by performing measurements and checking whether the experimental results agree with our predictions. It is then important to connect past observations with future observations or, more precisely, to make specific statements about results of future observations based on past observations. In quantum mechanics exactly this connection between past and future observation is achieved by Liouville's equation (for pure states it reduces to the Schrödinger equation)
$$i\hbar\frac{d\hat{\rho}(t)}{dt} = [\hat{H}(t), \hat{\rho}(t)].$$
The initial state $\hat{\rho}(t_0)$ represents all our information as obtained by earlier observation. Using the Liouville equation we can derive a time-evolved final state $\hat{\rho}(t)$ at some future time $t$. This state represents our knowledge necessary to arrive at a set of, in general probabilistic, predictions for any possible future observation of the system.

In this section the dynamics of a quantum system is formulated as a time evolution of the catalog of our knowledge of the system. This is specified by the evolution of the information vector in the space of information. The Liouville equation will be derived from the differential equation describing the motion of the information vector in the information space.

From the dynamical point of view one may distinguish five classes of physical systems in such a way that each class is a special case of the previous one:

1. A physical system and the rest of the world are two subsystems, that is, the physical system has kinematical independence from its environment: there are certain physical properties and/or parameters that uniquely and at every time determine the system (see below for examples).
2. A physical system has dynamical independence from its environment, that is, the system is a kinematically independent subsystem of the world under the influence of the environment (in a special case this influence is not present), but this influence is not changed by the change of the state of the system (there is no back-reaction from the system to the environment). To give a concrete counterexample, that is, an example of a system that is kinematically but not dynamically independent, consider a neutron in a deuteron (the nucleus consisting of a proton and a neutron bound by the nuclear interaction). One cannot define an external field for the neutron because the influence of the proton on the neutron depends on the actual state of the neutron and changes with the change of this state.
3. A physical system is conservative when the energy of the system is constant in time, although an external field could be present. An opposite example, that is, an example of a dynamically independent but nonconservative system, is observed whenever energy is pumped from the environment into the system (a proton in the variable electromagnetic field of a cyclotron) or when the environment takes energy from the system (a particle in a Wilson chamber).
4. A physical system is isolated when it is not in an external field, that is, when it is not under the influence of the environment. A contrary example, that is, an example of a conservative but nonisolated system, is an electron in the field of a nucleus.
5. A physical system of free particles is a system of isolated particles that do not interact with each other. A counterexample is a system of isolated charged particles.
Let us consider a single elementary system that is dynamically dependent on its environment. Suppose the environment of the system consists of another $N$ elementary systems. Suppose further that our system and the $N$ elementary systems in the environment are initially completely separated from each other. By complete separation we mean that there is no interaction between the elementary system and the environment. We therefore consider our elementary system and the environment as the two subsystems of a larger system of $N + 1$ elementary systems, which by our principle represent $N + 1$ propositions. Evidently one proposition is represented by our system under consideration and $N$ propositions are represented by the environment. Now let the initially separated subsystems interact with each other. It is then suggestive to assume that the information represented jointly by the two subsystems is conserved during the interaction process. That is, the interaction can neither increase the total amount of information represented jointly by the two subsystems nor reduce it. After the interaction, the total information of the two subsystems must still be represented by $N + 1$ propositions. Either it will still be represented by the two subsystems individually, in such a way that one proposition is represented by our system and $N$ propositions are represented by its environment, or it will be represented by the two subsystems in a joint way. In the latter case we have information exchange between our system and the environment during the interaction, and this may result in a decrease of the total amount of information represented by the system. In the extreme case our system may even carry no information on its own.

We represent the state of a system at time $t_0$ by the catalog of all our information $\vec{i}(t_0) = (i_1(t_0), ..., i_m(t_0))$ of a complete set of $m$ mutually complementary observations. If the system is not dynamically independent from the environment, then we cannot formulate the time evolution of the system alone (independent of the environment) but have to consider it as a subsystem of a larger system that is dynamically independent. For a system dynamically independent from the environment and not exposed to measurements, there is no information exchange with the environment and the total information content of the system is conserved, i.e.
$$I_{total}(t) = \sum_{i=1}^{m} I_i(t) = \sum_{i=1}^{m} I_i(t_0) = I_{total}(t_0). \tag{3.23}$$
Here, we calculate the total information content of the system at an initial time $t_0$ and at some later time $t$ by summing the individual measures of information (Eq. 1.31) over a complete set of $m$ mutually complementary observations at the two times. We therefore obtain an ultimate constant of the motion, independent of the strength, time dependence or resonance character of the external field of the system. The conservation of the total information content of the system corresponds to the conservation of the length of the information vector during its motion in the information space. This is possible if the information vector rotates in the space of information,
$$\vec{i}(t) = \hat{R}(t, t_0)\,\vec{i}(t_0). \tag{3.24}$$
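Equation (3.24) expresses our expectation that the evolution law is linear in the space of information, that is, independent of the actual information vector transformed. See Sec. 3.6 for a detailed discussion and for the relation between the linearity of the evolution law and the no-superluminal-signaling requirement. We take the time derivative of Eq. (3.24) and find
$$\frac{d\vec{i}(t)}{dt} = \frac{d\hat{R}(t, t_0)}{dt}\,\vec{i}(t_0) = \hat{K}(t, t_0)\,\vec{i}(t), \tag{3.25}$$
where $\hat{K}(t, t_0) = \frac{d\hat{R}(t, t_0)}{dt}\hat{R}^T(t, t_0)$. The operator $\hat{K}(t, t_0)$ is antisymmetric because
$$\begin{aligned} \hat{K}^T(t, t_0) &= \hat{R}(t, t_0)\frac{d\hat{R}^T(t, t_0)}{dt} = \hat{R}(t, t_0)\lim_{\Delta t \to 0}\frac{\hat{R}^T(t + \Delta t, t_0) - \hat{R}^T(t, t_0)}{\Delta t} \\ &= \lim_{\Delta t \to 0}\hat{R}(t, t_0)\hat{R}^T(t + \Delta t, t_0)\,\frac{\hat{R}(t, t_0) - \hat{R}(t + \Delta t, t_0)}{\Delta t}\,\hat{R}^T(t, t_0) \\ &= -\lim_{\Delta t \to 0}\frac{\hat{R}(t + \Delta t, t_0) - \hat{R}(t, t_0)}{\Delta t}\,\hat{R}^T(t, t_0) = -\hat{K}(t, t_0). \end{aligned}$$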
If we restrict our consideration to an elementary system and the associated three-dimensional information space, we may uniquely associate the vector of rotation $\vec{u}$ with any antisymmetric operator $\hat{K}$ by the relation (footnote 6)
$$\hat{K}\vec{y} = \vec{u} \times \vec{y} \quad \text{for all } \vec{y}, \tag{3.26}$$
where $\times$ denotes the vector product. We now rewrite Eq. (3.25) as
$$\frac{d\vec{i}(t)}{dt} = \vec{u}(t, t_0) \times \vec{i}(t). \tag{3.27}$$
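To make the content of Eq. (3.27) concrete, the following sketch integrates $d\vec{i}/dt = \vec{u}(t) \times \vec{i}$ numerically and checks that the length of the information vector is conserved even when the axis changes in time. The fourth-order Runge-Kutta scheme and the particular axis `example_axis` are choices made here for illustration; neither is taken from the text.

```python
import math

def cross(u, v):
    """Vector product u x v of two 3-component tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def shifted(v, w, scale):
    """Return v + scale * w component-wise."""
    return tuple(a + scale * b for a, b in zip(v, w))

def rk4_step(axis, i, t, dt):
    """One Runge-Kutta step for di/dt = u(t) x i."""
    k1 = cross(axis(t), i)
    k2 = cross(axis(t + dt / 2), shifted(i, k1, dt / 2))
    k3 = cross(axis(t + dt / 2), shifted(i, k2, dt / 2))
    k4 = cross(axis(t + dt), shifted(i, k3, dt))
    return tuple(a + dt / 6 * (b + 2 * c + 2 * d + e)
                 for a, b, c, d, e in zip(i, k1, k2, k3, k4))

def evolve(axis, i0, t_end, steps):
    """Propagate the information vector from t = 0 to t = t_end."""
    i, dt = tuple(i0), t_end / steps
    for step in range(steps):
        i = rk4_step(axis, i, step * dt, dt)
    return i

def length_squared(i):
    """Squared length of the information vector."""
    return sum(c * c for c in i)

def example_axis(t):
    """Hypothetical time-dependent rotation axis u(t)."""
    return (math.cos(t), math.sin(t), 0.5)

final = evolve(example_axis, (0.0, 0.0, 1.0), t_end=2.0, steps=2000)
print(final, length_squared(final))
```

Because the right-hand side is always perpendicular to the information vector, the squared length stays at its initial value up to the small integration error.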
We formulate the evolution of a quantum state in time as an evolution of the catalog of our knowledge of the system. In the space of information this is described by Eq. (3.27) as a rotation of a single information vector around the axis $\vec{u}(t, t_0)$ that itself changes in the course of time. Eq. (3.27) might be seen as a formulation of the dynamical law for information. It describes how individual measures of information for a complete set of mutually complementary propositions evolve in time.

Footnote 6: For an elementary system and the associated three-dimensional information space, the operator $\hat{K}$ will be represented by an antisymmetric matrix
$$\hat{K} = \begin{pmatrix} 0 & -k_{21} & -k_{31} \\ k_{21} & 0 & -k_{32} \\ k_{31} & k_{32} & 0 \end{pmatrix}.$$
We may now read out the components of the vector of rotation $\vec{u}$ as $u_1 = k_{32}$, $u_2 = -k_{31}$, $u_3 = k_{21}$.

Based on known features of quantum physics, we will now argue for the validity of Eq. (3.27). Suppose that the quantum state of the system is described by the density matrix $\hat{\rho}$. We decompose the density matrix into the unity operator and the generators of the SU(2) algebra (Pauli matrices)
$$\hat{\rho}(t) = \frac{1}{2}\hat{I} + \frac{1}{2}\sum_{j=1}^{3} i_j(t)\,\hat{\sigma}_j, \tag{3.28}$$
where $\hat{\sigma}_j$ is the spin operator along the direction $j$ ($j = x, y, z$) and $i_j(t) = \mathrm{Tr}\,(\hat{\rho}(t)\hat{\sigma}_j)$ is the information about the spin along the direction $j$ at time $t$. If we take the time derivative of Eq. (3.28) we obtain
$$i\hbar\frac{d\hat{\rho}(t)}{dt} = \frac{i\hbar}{2}\sum_{j=1}^{3}\frac{di_j(t)}{dt}\,\hat{\sigma}_j. \tag{3.29}$$
Using Eq. (3.27), this becomes
$$i\hbar\frac{d\hat{\rho}(t)}{dt} = \frac{i\hbar}{2}\sum_{i,j,k=1}^{3}\epsilon_{ijk}\,u_i(t)\,i_j(t)\,\hat{\sigma}_k. \tag{3.30}$$
With the commutation relations $[\hat{\sigma}_i, \hat{\sigma}_j] = 2i\sum_{k=1}^{3}\epsilon_{ijk}\hat{\sigma}_k$, we proceed with
$$i\hbar\frac{d\hat{\rho}(t)}{dt} = \frac{\hbar}{4}\sum_{i,j=1}^{3} u_i(t)\,i_j(t)\,[\hat{\sigma}_i, \hat{\sigma}_j]. \tag{3.31}$$
Introducing the operator $\hat{H}(t)$ such that
$$u_i(t) := \frac{1}{\hbar}\mathrm{Tr}\,(\hat{H}(t)\hat{\sigma}_i), \tag{3.32}$$
we finally obtain the well-known Liouville equation
$$i\hbar\frac{d\hat{\rho}(t)}{dt} = [\hat{H}(t), \hat{\rho}(t)]. \tag{3.33}$$
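The equivalence between the rotation law (3.27) and the Liouville equation (3.33) can be checked numerically for a single two-level system. The sketch below is only an illustration: it assumes units with $\hbar = 1$, takes the rotation axis as $u_j = \mathrm{Tr}(\hat{H}\hat{\sigma}_j)/\hbar$, and uses an arbitrary Hermitian matrix and information vector; all function names are hypothetical.

```python
HBAR = 1.0  # units with hbar = 1 are an assumption of this sketch

SIGMA = (
    ((0, 1), (1, 0)),        # sigma_x
    ((0, -1j), (1j, 0)),     # sigma_y
    ((1, 0), (0, -1)),       # sigma_z
)

def mul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(tuple(sum(a[r][k] * b[k][c] for k in range(2))
                       for c in range(2)) for r in range(2))

def trace(a):
    return a[0][0] + a[1][1]

def density(i):
    """rho = (1/2)(I + sum_j i_j sigma_j), as in Eq. (3.28)."""
    return tuple(tuple(0.5 * ((r == c) + sum(i[j] * SIGMA[j][r][c] for j in range(3)))
                       for c in range(2)) for r in range(2))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def rate_from_liouville(h, i):
    """d i_j/dt = Tr(d rho/dt sigma_j), with i*hbar d rho/dt = [H, rho]."""
    rho = density(i)
    hr, rh = mul(h, rho), mul(rho, h)
    drho = tuple(tuple((hr[r][c] - rh[r][c]) / (1j * HBAR) for c in range(2))
                 for r in range(2))
    return tuple(trace(mul(drho, s)).real for s in SIGMA)

def rate_from_rotation(h, i):
    """d i/dt = u x i, with u_j = Tr(H sigma_j) / hbar."""
    u = tuple(trace(mul(h, s)).real / HBAR for s in SIGMA)
    return cross(u, i)

h_example = ((1.0, 0.3 - 0.4j), (0.3 + 0.4j, -0.5))  # arbitrary Hermitian matrix
i_example = (0.2, -0.5, 0.7)                          # arbitrary information vector
print(rate_from_liouville(h_example, i_example))
print(rate_from_rotation(h_example, i_example))
```

Both functions return the same rate of change of the information vector, which is the content of the step from Eq. (3.27) to Eq. (3.33) for this example.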
For the special case of a conservative system, the evolution of a quantum state in time is constrained by a further constant of motion, namely our information about the energy of the system, in addition to the total information content of the system. In the space of information this corresponds to the rotation of the information vector around a fixed axis that is associated with our knowledge of the energy of the system. This is only possible if the axis $\vec{u}$ in Eq. (3.27), around which the information vector rotates, is fixed in time. The component of the information vector that remains conserved in time, and therefore corresponds to the information $i_E$ about the energy of the system, is obtained by the projection of the information vector onto the fixed rotation axis $\vec{u}$ (Fig. 3.11), i.e.
$$i_E = \frac{\vec{u} \cdot \vec{i}}{|\vec{u}|}. \tag{3.34}$$

Figure 3.11: One complete rotation of the information vector after a time elapse of the wave period $T$. Projection of the information vector $\vec{i}(t)$ onto the axis $\vec{u}$ gives the information $i_E$ about the energy of the system.
We emphasize that by information about the energy of the system we mean our knowledge about the truth value of the proposition $P_E$: "The energy of the system is $E_1$", which can be verified directly by an appropriately designed experiment. Since we consider elementary systems, that is, two-level systems having only two energy eigenvalues $E_1$ and $E_2$, the truth value of the proposition "The energy of the system is $E_1$" is always the negation of the truth value of the proposition "The energy of the system is $E_2$". Rotation of the information vector around a fixed axis implies the existence of a minimal interval of time which the information vector needs to make one complete rotation in the space of information (Fig. 3.11). Analogous to the deBroglie wavelength (see Sec. 3.4), the deBroglie wave period can be defined as the minimal time interval $T$ after which information about any proposition $P$ (that differs from the proposition $P_E$ concerning the energy of the system) takes the same value. Formally, the deBroglie wave period is defined through the relation $i_P(t + T) = i_P(t)$ for all $P \neq P_E$.
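For a fixed axis the rotation law $d\vec{i}/dt = \vec{u} \times \vec{i}$ can be solved in closed form, which allows a quick check of the two statements above: the projection (3.34) is constant, and the vector returns to itself after one period. The sketch uses Rodrigues' rotation formula and takes $T = 2\pi/|\vec{u}|$, which follows from $|\vec{u}|$ being the angular frequency of the rotation; the numerical values are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def rotate_about(u, i, t):
    """Solution of di/dt = u x i for a constant axis u (Rodrigues' formula)."""
    omega = math.sqrt(dot(u, u))
    n = tuple(c / omega for c in u)
    c, s = math.cos(omega * t), math.sin(omega * t)
    n_cross_i, n_dot_i = cross(n, i), dot(n, i)
    return tuple(i[k] * c + n_cross_i[k] * s + n[k] * n_dot_i * (1.0 - c)
                 for k in range(3))

def energy_information(u, i):
    """Eq. (3.34): projection of the information vector onto the fixed axis."""
    return dot(u, i) / math.sqrt(dot(u, u))

def wave_period(u):
    """Time for one complete rotation about u, T = 2*pi/|u|."""
    return 2.0 * math.pi / math.sqrt(dot(u, u))

u = (1.0, 2.0, 2.0)          # hypothetical fixed axis, |u| = 3
i0 = (0.0, 0.6, 0.8)         # hypothetical initial information vector
later = rotate_about(u, i0, 0.37)
after_period = rotate_about(u, i0, wave_period(u))
print(energy_information(u, i0), energy_information(u, later), after_period)
```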
3.6 Linearity and Arbitrarily Fast Communication
Assume that the catalog of our knowledge of an individual system at time $t = 0$ contains all our information $\vec{i} = (i_1, ..., i_m)$ of a complete set of $m$ mutually complementary observations. Suppose further that the mapping which evolves all individual values of the information from the catalog of our knowledge of the system in time is defined as
$$g_t : \vec{i} \to g_t(\vec{i}). \tag{3.35}$$
If the system is dynamically independent from the environment and not exposed to measurements, there will be no information exchange with the environment, and the total information content of the system (the length of the information vector) will be conserved, i.e.
$$I_{total}(t) = |g_t(\vec{i})|^2 = |\vec{i}|^2 = I_{total}(0). \tag{3.36}$$
We call a mapping that preserves the length of the information vectors a unitary mapping. The evolution law in quantum mechanics described by the Schrödinger equation is an example of a unitary mapping. Besides the property of unitarity, the evolution law in quantum mechanics is specified by one other important property, namely linearity
$$g_t(a\,\vec{i}_1 + b\,\vec{i}_2) = a\,g_t(\vec{i}_1) + b\,g_t(\vec{i}_2) \quad \forall\, a, b \in \mathbb{R} \text{ and } \forall\, \vec{i}_1, \vec{i}_2. \tag{3.37}$$
This implies that the mapping describing the evolution law in quantum mechanics preserves the scalar product between two information vectors. For the information vectors $\vec{i}_1$ and $\vec{i}_2$, both given at time $t = 0$ and evolving respectively into $g_t(\vec{i}_1)$ and $g_t(\vec{i}_2)$ at some future time $t$, we have (footnote 7)
$$g_t(\vec{i}_1) \cdot g_t(\vec{i}_2) = \vec{i}_1 \cdot \vec{i}_2 \tag{3.38}$$
in quantum mechanics. In fact, only when the mapping is both unitary and linear will the scalar product, or equivalently the angle, between two information vectors $\vec{i}_1$ and $\vec{i}_2$ be preserved during the evolution. Then, for $\vec{i} = a\,\vec{i}_1 + b\,\vec{i}_2$, we obtain
$$\begin{aligned} |g_t(\vec{i})|^2 = |\vec{i}|^2 \;&\Rightarrow\; a^2|g_t(\vec{i}_1)|^2 + b^2|g_t(\vec{i}_2)|^2 + 2ab\,g_t(\vec{i}_1) \cdot g_t(\vec{i}_2) = a^2|\vec{i}_1|^2 + b^2|\vec{i}_2|^2 + 2ab\,\vec{i}_1 \cdot \vec{i}_2 \\ &\Rightarrow\; g_t(\vec{i}_1) \cdot g_t(\vec{i}_2) = \vec{i}_1 \cdot \vec{i}_2, \end{aligned}$$
where $|g_t(\vec{i}_1)|^2 = |\vec{i}_1|^2$ and $|g_t(\vec{i}_2)|^2 = |\vec{i}_2|^2$ by unitarity.

Footnote 7: It can easily be obtained that $\vec{i}_1 \cdot \vec{i}_2 = 2|\langle\psi_1|\psi_2\rangle|^2 - 1$, where the vectors $\vec{i}_1$ and $\vec{i}_2$ in the space of information are associated with the pure states $|\psi_1\rangle$ and $|\psi_2\rangle$ in the Hilbert space, respectively. We can then interpret $|\langle\psi_1|\psi_2\rangle|^2$ as the probability of finding the state represented by $|\psi_2\rangle$ as a result of a measurement at time $t = 0$ if the state was represented by $|\psi_1\rangle$. Similarly, we can interpret $|\langle\psi_1(t)|\psi_2(t)\rangle|^2$ as the corresponding probability at the later time $t$. The assumption of preserving the scalar product says that these probabilities are the same.

This implies that information vectors initially distributed homogeneously over the whole space of information will be transformed again into homogeneously distributed final information vectors. By homogeneously distributed information vectors we precisely mean a set of information vectors distributed over the whole information space and separated from each other by an equal solid angle.

From what deeper foundations does the property of preserving the scalar product in the evolution law of quantum mechanics emerge? While we know that the unitarity of the evolution law follows from the requirement of conservation of the total information of the system during the evolution, we suggest that the property of linearity follows from the no-superluminal-signaling requirement. Consider two observers, Alice and Bob, far away from each other (e.g., several thousand light years). A source of pairs of spin-1/2 particles is placed between them, e.g., halfway between Alice and Bob. The source continuously emits pairs of spin-1/2 particles in the singlet state
$$|\psi\rangle = \frac{1}{\sqrt{2}}\left(|z+\rangle_1|z-\rangle_2 - |z-\rangle_1|z+\rangle_2\right) = \frac{1}{\sqrt{2}}\left(|x+\rangle_1|x-\rangle_2 - |x-\rangle_1|x+\rangle_2\right), \tag{3.39}$$
such that particle 1 travels towards Alice and particle 2 towards Bob. Suppose that particle 2, when it comes to Bob, is influenced by an external field in Bob's local environment and that this influence is described by a unitary and nonlinear evolution function $g_t$ (Fig. 3.12a). We will show that this evolution allows the construction of experimental situations where information is transferred arbitrarily fast. The maximally entangled two-particle state (3.39) represents the two-bit combination false-false of the truth values of the propositions "The two spins are equal along the z-axis" and "The two spins are equal along the x-axis". Because the two bits of information are exhausted in specifying spin correlations, there cannot be any information carried by the individuals, and they are described by a completely mixed state $\hat{\rho} = \frac{1}{2}\hat{I}$ or, equivalently, by the zero vector $\vec{i} = 0$ in the space of information.

Figure 3.12: Two different mappings $g_t : \vec{i} \to g_t(\vec{i})$ describing the time evolution of the state of the system in the space of information. If the mapping is unitary and nonlinear (Fig. a), the evolution conserves the total information of the system and enables superluminal signaling. If the mapping is non-unitary and linear (Fig. b), the evolution does not conserve the total information of the system and does not allow superluminal signaling (see discussion in the text below).

We emphasize again that by representing the truth value false of the proposition "The two spins are equal along the z-axis" we mean a statement that can be verified directly by experiment, e.g., by measuring the spin of particle 1 along z to be up (down) and subsequently measuring the spin of particle 2 along the same direction to be down (up). Now consider the following two physical situations.

1. Alice does not make any measurement on particle 1. Because there is no information exchange between particle 2 and the environment, the initial information vector $\vec{i} = 0$ of particle 2 evolves in time to an information vector of equal length, i.e. $g_t(0) = 0$. This procedure is illustrated in Fig. 3.13a.
2. Alice first measures the spin of particle 1 along z, and subsequently particle 2 evolves according to the evolution function $g_t$. Since the information content of the composite system is specified by the truth value (false) of the proposition "The two spins are equal along the z-axis", the measurement on particle 1 immediately gives the information content in the spin along z of particle 2. After the measurement on particle 1, particle 2 is specified by the truth value of the proposition "The spin of particle 2 is up along z". According to the truth value (true or false) of that proposition, half of the particles 2 will be described by the information vector $\vec{i}_1 = (0, 0, +1)$ and the other half by $\vec{i}_2 = (0, 0, -1)$. Because there is no information exchange between an individual particle 2 and the environment, the total information content of each individual particle 2 remains conserved in time, i.e. $|g_t(\vec{i}_1)|^2 = |\vec{i}_1|^2 = 1$ and $|g_t(\vec{i}_2)|^2 = |\vec{i}_2|^2 = 1$. This procedure is illustrated in Fig. 3.13b.

One may easily see that the information vector associated with an ensemble of particles, with fraction $w_1$ of particles having information vector $\vec{i}_1$ and fraction $w_2$ of particles having information vector $\vec{i}_2$, is given by $w_1\vec{i}_1 + w_2\vec{i}_2$. This precisely means that the probabilities for all outcomes of all possible observations of an individual system described by $\vec{i}$ (where $\vec{i}$ can, just formally, be written as $w_1\vec{i}_1 + w_2\vec{i}_2$) are equal to those for a classical mixture of two sub-ensembles, with fraction $w_1$ having information vector $\vec{i}_1$ and fraction $w_2$ having information vector $\vec{i}_2$. In our example, the information vector associated with the ensemble of particles 2 immediately after the measurement of particle 1 is performed is $\frac{1}{2}\vec{i}_1 + \frac{1}{2}\vec{i}_2 = 0$. The coefficients $\frac{1}{2}$ are the weighting factors introduced because a measurement on particle 1 gives each of the two possible outcomes half the time. The individual information vectors $\vec{i}_1$ and $\vec{i}_2$ evolve in the course of time independently, giving $\frac{1}{2}g_t(\vec{i}_1) + \frac{1}{2}g_t(\vec{i}_2)$ for the information vector of the whole ensemble of particles 2 at time $t$.
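The bookkeeping of ensemble information vectors described above can be sketched in a few lines. The sketch assumes that the probability of the outcome "up" along a unit direction $\vec{n}$ is $(1 + \vec{i} \cdot \vec{n})/2$ (an assumption consistent with $i = 2p - 1$, but not stated in this section), and it uses a rotation about the x-axis as one example of a linear, length-preserving map; all names and numerical values are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def probability_up(i, direction):
    """Assumed rule: p(up along unit vector n) = (1 + i . n) / 2."""
    return 0.5 * (1.0 + dot(i, direction))

def mixture(weights, vectors):
    """Information vector w1*i1 + w2*i2 + ... of a classical mixture."""
    return tuple(sum(w * v[k] for w, v in zip(weights, vectors)) for k in range(3))

def rotate_x(i, angle):
    """An example of a linear, length-preserving map g: rotation about x."""
    c, s = math.cos(angle), math.sin(angle)
    return (i[0], i[1] * c - i[2] * s, i[1] * s + i[2] * c)

i1, i2 = (0.0, 0.0, 1.0), (0.0, 0.0, -1.0)
weights = (0.5, 0.5)
ensemble = mixture(weights, (i1, i2))                   # the zero vector
evolved = mixture(weights, (rotate_x(i1, 0.8), rotate_x(i2, 0.8)))
print(ensemble, evolved, rotate_x(ensemble, 0.8))
```

For this linear map the evolved mixture coincides with the evolved zero vector, so the two situations cannot be told apart; the argument in the text concerns what happens when this equality fails.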
Now, if the evolution law $g_t$ is not linear, i.e.,
$$\frac{1}{2}g_t(\vec{i}_1) + \frac{1}{2}g_t(\vec{i}_2) \neq g_t\!\left(\frac{1}{2}\vec{i}_1 + \frac{1}{2}\vec{i}_2\right) = g_t(0) = 0, \tag{3.40}$$
the two information vectors of the ensemble of particles 2 resulting from the two procedures given above are not the same and can be distinguished by Bob (Fig. 3.13). Therefore, by choosing whether or not to measure particle 1 (or by choosing to measure particle 1 in two adequately chosen bases, see the example below), Alice may communicate with Bob arbitrarily fast.

To illustrate this, let us briefly review Gisin's [Gisin, 1990, 1993] analysis of Weinberg's [Weinberg, 1989(a), 1989(b)] proposal for introducing nonlinear corrections into ordinary quantum mechanics. We consider the following nonlinear and unitary (norm-preserving) evolution law defined in terms of the wave function
$$\frac{d\psi}{dt} = -2i\,\langle\sigma_z\rangle_\psi\,\sigma_z\,\psi, \quad \text{where} \quad \langle\sigma_z\rangle_\psi = \frac{\langle\psi|\sigma_z|\psi\rangle}{\langle\psi|\psi\rangle}. \tag{3.41}$$
This is a case of Weinberg's proposal for the interaction of a spin-1/2 particle and an external electric quadrupole. Now suppose that the evolution $g_t$ of particle 2 in our example is exactly described by Eq. (3.41) and consider again the following two physical situations:

1. Alice first measures the spin of particle 1 along z, and subsequently particle 2 evolves according to $g_t$ (Fig. 3.14a).
2. Alice first measures the spin of particle 1 along the direction $\vec{u}$ lying in the x-z plane at 45° with respect to the z-axis, and subsequently particle 2 evolves according to $g_t$ (Fig. 3.14b).

If the spin of particle 1 along z is measured, the initial states of particle 2 with spin up and down along z are stationary according to the evolution equation (3.41). In particular, the mean value $\langle\sigma_y\rangle_t$ (which is the y-component of the information vector $\frac{1}{2}g_t(\vec{i}_1) + \frac{1}{2}g_t(\vec{i}_2)$ associated with the whole ensemble of particles 2) is always zero. However, if the spin of particle 1 along the direction $\vec{u}$ is measured, the initial states of particle 2 with spin up and spin down along $\vec{u}$ rotate around the z-axis with the same frequency but in opposite directions, implying that the mean value $\langle\sigma_y\rangle_t$ is no longer zero. In particular, after a time elapse of one quarter of the period, the individual spins will have the same positive value of $\langle\sigma_y\rangle_t$ (Fig. 3.14b). Therefore, by the choice of the measurement basis for particle 1, Alice may communicate with Bob arbitrarily fast.

We see that the information content of a classical mixture of particles 2 with spin up and spin down along the direction $\vec{u}$ changes in the course of time under the assumption of nonlinearity of the evolution law. Because there is no information exchange between an individual particle and the environment, and there is also no information exchange between the individual particles themselves (individual particles are assumed not to interact with each other), this change cannot be assumed to originate in a change of the information contents of the individual constituents of the ensemble. Otherwise, the lengths of the information vectors $\vec{i}_1$ and $\vec{i}_2$ would not be conserved. We view the change of the information content of the ensemble as a consequence of the change of the distinguishability of the individual constituents of the ensemble as measured by a complete set of mutually complementary observations. By distinguishability we mean a measure of how much the probabilities for outcomes of all mutually complementary experiments differ from each other for different subensembles (each subensemble consisting of indistinguishable individuals) of the ensemble. Our discussion suggests that an adequate measure of distinguishability might be the scalar product $\vec{i}_1 \cdot \vec{i}_2$. However, as shown above, the quantum evolution law preserves the distinguishability of the individual constituents of the ensemble.

The important result of the discussion so far is that the no-superluminal-signaling principle, "Information cannot be transferred arbitrarily fast", might be seen as an independent foundational principle for quantum mechanics and as another requirement, besides the principle of quantization of information, necessary for the derivation of the essential features of the structure of quantum theory. It might then be possible to derive the linearity of quantum theory from the no-superluminal-signaling principle together with the principle that an elementary system carries one bit of information and the requirement that this information is conserved if there is no information exchange. This certainly deserves a much more rigorous analysis than that presented here.

For completeness we note that an evolution law described by a stochastic equation (in the sense that an initial pure state may evolve into a mixture of pure states), like that proposed by Ghirardi, Rimini and Weber (Ghirardi et al., 1986), is an example of a linear and non-unitary function of information vectors in the space-of-information description. Such an evolution does not preserve the total information content of the system and does not allow instantaneous signaling (see Fig. 3.12b and Ref. Gisin, 1989, 1990, 1993). However, such suggestions are not just interpretations but are actually real alternatives to quantum theory. In view of the extremely high precision with which quantum theory has been experimentally confirmed, and in view of its superb mathematical beauty and symmetry, we consider the final success of such attempts to be extremely unlikely.
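The signaling argument based on Eq. (3.41) can be reproduced with a small numerical sketch. It assumes the sign convention $d\psi/dt = -2i\langle\sigma_z\rangle_\psi\sigma_z\psi$ and uses the fact that $\langle\sigma_z\rangle_\psi$ is constant under this evolution, so that each spinor component only acquires a phase; the function names and the chosen times are illustrative.

```python
import cmath
import math

def sigma_z_mean(psi):
    """Normalized expectation value of sigma_z for a spinor (a, b)."""
    a, b = psi
    return (abs(a) ** 2 - abs(b) ** 2) / (abs(a) ** 2 + abs(b) ** 2)

def evolve(psi, t):
    """Solution of d psi/dt = -2i <sigma_z> sigma_z psi (assumed form of Eq. 3.41).

    <sigma_z> depends only on |a| and |b|, which the evolution leaves
    unchanged, so the components pick up opposite phases at a constant rate.
    """
    a, b = psi
    z = sigma_z_mean(psi)
    return (a * cmath.exp(-2j * z * t), b * cmath.exp(2j * z * t))

def sigma_y_mean(psi):
    """Normalized expectation value of sigma_y, equal to 2 Im(a* b) / norm."""
    a, b = psi
    norm = abs(a) ** 2 + abs(b) ** 2
    return 2.0 * (a.conjugate() * b).imag / norm

def spin_states(angle):
    """Spin up and down along a direction in the x-z plane at `angle` from z."""
    c, s = math.cos(angle / 2), math.sin(angle / 2)
    return (c, s), (-s, c)

def ensemble_mean_y(angle, t):
    """y-component of the ensemble information vector after Alice's measurement."""
    up, down = spin_states(angle)
    return 0.5 * sigma_y_mean(evolve(up, t)) + 0.5 * sigma_y_mean(evolve(down, t))

quarter_period = math.pi / (4.0 * math.sqrt(2.0))  # for the 45 degree basis
print(ensemble_mean_y(0.0, quarter_period))          # Alice measures along z
print(ensemble_mean_y(math.pi / 4, quarter_period))  # Alice measures along u
```

With the measurement along z the ensemble value of the y-component stays at zero, whereas for the 45° basis both sub-ensembles reach the same positive value after a quarter period, so Bob's statistics would depend on Alice's choice of basis.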
3.7 Change of Information in Measurement - Reduction of the Wave Packet
In this section, it will be argued that identifying the quantum state of a system with the catalog of our knowledge of the system leads to the resolution of many of the seemingly paradoxical features of quantum mechanics. The state of a quantum system changes continuously according to the dynamical law (described by the Schrödinger equation) on the one hand, and in a discontinuous fashion whenever the observer acquires new information about the system through the process of measurement (sometimes called the collapse of the wave packet) on the other. The existence of two intrinsically different laws for the evolution of the quantum state is a standard subject of discussion of the so-called measurement paradox in quantum mechanics. In a quantum measurement, we find the system to be in one of the eigenstates of the observable defined by the measurement apparatus. A specific example is the case in which we consider a wave packet as being composed of a superposition of plane waves. Such a wave packet is more or less well localized, but we can always perform a position measurement that localizes the particle better than the dimension of the packet itself. This reduction of the wave packet can only be seen as a measurement paradox if one views this change of the quantum state as a real physical process. In the extreme case it is often even related to an instant collapse of some physical wave in space. There is no basis for any such assumption. In contrast, there is never a paradox if we realize that the wave function is just an encoded mathematical representation of our knowledge of the system (or, more properly, of that knowledge which is obtained by an ideal observer in an optimum experiment, the latter qualification covering the possibility that the actual experiment performed may be less than optimum due to noise, to insensitivity, or to other instrumental problems).
When the state of a quantum system has a non-zero value at some position in space at some particular time, it does not mean that the system is physically present at that point, but only that our knowledge (or lack of knowledge) of the system allows the particle the possibility of being present at that point at that instant. What can be more natural than to change the representation of our knowledge if we gain new knowledge from the measurement performed on the system? When a measurement is performed, our knowledge of the system changes, and therefore its representation, the quantum state, also changes. In agreement with the new knowledge, it instantaneously changes all its components, even those which describe our knowledge in the regions of space quite distant from the site
of the measurement. Then no need whatsoever arises to allude to notions like superluminal or instantaneous transmission of information. This view was assumed by Schrödinger [1935], who wrote⁸: "Bei jeder Messung ist man genötigt, der ψ-Funktion (= dem Voraussagenkatalog) eine eigenartige, etwas plötzliche Veränderung zuzuschreiben, die von der gefundenen Maßzahl abhängt und sich nicht vorhersehen läßt; woraus allein schon deutlich ist, daß diese zweite Art von Veränderung der ψ-Funktion mit ihrem regelmässigen Abrollen zwischen zwei Messungen nicht das mindeste zu tun hat. Die abrupte Veränderung durch die Messung ... ist der interessanteste Punkt der ganzen Theorie. Es ist genau der Punkt, der den Bruch mit dem naiven Realismus verlangt. Aus diesem Grund kann man die ψ-Funktion nicht direkt an die Stelle des Modells oder des Realdings setzen. Und zwar nicht etwa weil man einem Realding oder einem Modell nicht abrupte unvorhergesehene Änderung zumuten dürfte, sondern weil vom realistischen Standpunkt die Beobachtung ein Naturvorgang ist wie jeder andere und nicht per se eine Unterbrechung des regelmässigen Naturlaufs hervorrufen darf." A closely related position was assumed by Heisenberg, who wrote in a letter to Renninger dated February 2, 1960: "The act of recording, on the other hand, which leads to the reduction of the state, is not a physical, but rather, so to say, a mathematical process. With the sudden change of our knowledge also the mathematical presentation of our knowledge undergoes of course a sudden change" (as translated by Jammer [1974]). In order to obtain information about the system through an observation of the measurement apparatus, we have to establish some correlations between the system and the apparatus. A measurement apparatus always has to include all the hardware necessary to actually read out the information in some way.
This is usually meant to imply that there is a pointer on the apparatus with a finite set of discrete and well-distinguishable positions. Yet, it does not make any sense to talk about the quantum state of such an apparatus. An experimentalist simply has never seen a measurement apparatus in his laboratory for which he had to assume the existence of a superposition of pointer positions. For example, his computer, which permanently records the results of his experiments on a piece of paper, works perfectly well as a classical machine. In order to obtain
⁸ "For each measurement one is required to ascribe to the ψ-function (= the prediction catalog) a characteristic, quite sudden change, which depends on the measurement result obtained, and so cannot be foreseen; from which alone it is already quite clear that this second kind of change of the ψ-function has nothing whatever in common with its orderly development between two measurements. The abrupt change by measurement ... is the most interesting point of the entire theory. It is precisely the point that demands the break with naive realism. For this reason one cannot put the ψ-function directly in place of the model or of the physical thing. And indeed not because one might never dare impute abrupt unforeseen changes to a physical thing or to a model, but because in the realist point of view observation is a natural process like any other and cannot per se bring about an interruption of the orderly flow of natural events."
some knowledge of the quantum system we have to read the pointer position. In a well-defined experiment, observation of the apparatus will lead to the one well-defined answer that the pointer is found to be in a certain position, and thus the system itself will be found in a well-defined state. Consider a Stern-Gerlach experiment with a spin-1/2 particle. Two photographic plates are placed behind the magnet, as shown schematically in Fig. 3.15a. A detection point on the upper or the lower photographic plate is associated with the spin value up or down, respectively. Assume that initially the total information carried by the particle is encoded in specifying the spin along the x-axis. This information might not be complete. We may then assign to the system the information vector $\vec{i}^{<} = (s, 0, 0)$, where $s \in [-1, 1]$. Only when the information about the spin along the x-axis is complete ($s = 1$ or $s = -1$) and we actually measure the spin along the x-axis will the system remain unchanged after the measurement is performed. Otherwise, the measurement abruptly changes the system into a new state. For example, if we measure the spin along the z-axis and the lower photographic plate is hit by the particle, our knowledge of the particle will change abruptly:

\[
\underbrace{\vec{i}^{<} = (s, 0, 0)}_{\text{before measurement}} \;\longrightarrow\; \underbrace{\vec{i}^{>} = (0, 0, -1)}_{\text{after measurement}}. \tag{3.42}
\]
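The update of the information vector in this Stern-Gerlach example can be sketched in a few lines. The sketch assumes that each component of the information vector equals p(up) - p(down) for the corresponding direction and that the total information is the sum of the squared components; the function names and the value s = 0.6 are illustrative choices, not part of the text.

```python
import random

def total_information(info_vec):
    """Total information: sum of the squared components of the information vector."""
    return sum(c * c for c in info_vec)

def measure_spin(info_vec, axis, rng):
    """Sketch of an ideal spin measurement along coordinate axis 0 (x), 1 (y) or 2 (z).

    Assumes p(up) = (1 + i_axis) / 2, i.e. each component of the information
    vector is p(up) - p(down) for that direction.
    Returns the outcome (+1 or -1) and the information vector after the measurement.
    """
    p_up = (1.0 + info_vec[axis]) / 2.0
    outcome = 1 if rng.random() < p_up else -1
    after = [0.0, 0.0, 0.0]
    after[axis] = float(outcome)   # complete knowledge along the measured axis only
    return outcome, tuple(after)

rng = random.Random(0)             # fixed seed, only for reproducibility
s = 0.6
before = (s, 0.0, 0.0)             # partial knowledge of the spin along x
outcome, after = measure_spin(before, axis=2, rng=rng)

gain = total_information(after) - total_information(before)
print(outcome, after, gain)        # the gain is 1 - s**2 for either outcome
```

Which plate is hit is random, but the change in total information does not depend on the outcome; measuring along x a particle with s = 1 leaves the vector unchanged.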
After the measurement our knowledge about the spin along the z-direction is complete, i.e., $I^{>}_{total} = 1$. The spin of the particle now represents the true proposition: "The spin along the z-axis is down." After a measurement the state therefore must appear to be changed in accord with the new information (about the spin along the z-axis), if any, acquired about the system, together with the unavoidable and irrecoverable loss of complementary information (about the spin along orthogonal directions), because of the fundamental limitation on the total information content of a quantum system. Unlike a classical measurement, a quantum measurement therefore does not just add some knowledge (if any),

\[
I^{>}_{total} - I^{<}_{total} = 1 - s^2,
\]

it changes our knowledge in such a way that the total information of an elementary system does not exceed 1 bit of information ($1 - s^2 \le 1$). Suppose that the spin of our particle was initially up along the x-axis and that in the measurement of the spin along the z-axis we now observe spin down (the lower photographic plate is hit by the particle). Although each individual particle of an ensemble of identically prepared and identically described particles is subjected to the same measurement process, we will observe completely
Figure 3.15: The observation in the Stern-Gerlach experiment (a) in a quantum-mechanical description, where detection appears as a finite and indivisible event, an impact on one of the two photographic plates, and (b) in a classical wave description, where detection results from a continuous increase of the intensity of the whole pattern on both photographic plates simultaneously.
random measurement results. Can this stochasticity of individual events be reduced to causality, that is, to the fact that for every observed difference in individual events there are different causes? How do different outcomes emerge if all particles are equivalently described? Why is the lower photographic plate, rather than the upper one, hit in the actual individual measurement? It is just this point which seems central to many of the difficult problems of interpreting quantum mechanics. In an individual experimental trial either the upper or the lower photographic plate will be hit. This "... yes or no that is recorded constitutes an unsplittable bit of information" [Wheeler, 1989]. We emphasize that the property of a system to carry an unsplittable bit of information might be seen as a definition of a system as a particle. Because the gain of new information always emerges as an unsplittable bit of information, a finite and discrete event (a click in the detector, a detection point on the photographic plate, "irreversible amplification" in Bohr's sense), an individual particle must end on one single photographic plate only⁹ (Fig. 3.15a). On the other hand, an
⁹ One may ask: Why do events happen at all? Why is a photographic plate hit by a particle at all? These are justified questions. Weizsäcker poses a slightly different question: When do events happen? He argues that "... they always must have been created as an irreversible event ... as long as no irreversible process happens, one might maintain the continuous evolution of any event without a chance of being refuted, and, therefore, without a chance of defending it ... Irreversibility is always only a highly probable feature - never absolutely certain ... A moment of occurrence of the event can only be seen macroscopically."
individual particle carries just one bit of information, and this is completely exhausted in the definition of the spin along the x-axis. No information is left to also define the spin along the z-axis, which necessarily implies complete randomness in the landing of the particles on the two photographic plates. What physical consequences would follow from the assumption that the formation of patterns on the photographic plates is not the result of successive discrete detection events, but the result of a continuous increase of the intensity of the whole pattern on both photographic plates? The observed phenomenon would then be described as a classical field rather than by a formalism in which the probability for the occurrence of a specific discrete result has objective meaning. The detection process would no longer be specified by the impact on one single photographic plate (Fig. 3.15a), but by an infinitesimally small contribution to the darkening of both photographic plates simultaneously (Fig. 3.15b). One would then not observe stochasticity of individual events; they would appear in a causal way. Here the classical world of classical waves, like the waves we see on the surface of oceans, naturally emerges from the assumption of infinite divisibility of the portion of information a wave can carry. We will now bring the role of the observer in a quantum measurement to the center of our discussion. In classical physics we can assume that an observation reveals some property already existing in the outside world. For example, if we look at the moon, we just find out where it is, and it is certainly safe to assume that the property of the moon to be there is independent of whether anyone looks or not. The situation is drastically different in quantum mechanics, and it is just the very attitude of the Copenhagen interpretation to the fundamental role of observation which is a major intellectual step forward over naive classical realism.
With the sole exception of the system being in an eigenstate of the measured observable, a quantum measurement changes the system into one of the possible new states defined by the measurement apparatus in a fundamentally unpredictable way, and thus cannot be claimed to reveal a property existing before the measurement is performed. The reason for this is again the fact that a quantum system cannot, not even in principle, carry enough information to specify observation-independent properties corresponding to all possible measurements. It is therefore fundamentally impossible to assign to a quantum system simultaneously complementary attributes like position and momentum, or the path of the system and the position of appearance in the interference pattern in the double-slit experiment, or spin values along orthogonal directions. However, we as observers have a significant role in the measurement process, because we can decide by choosing the measuring device which attribute will be realized in the actual measurement. For example, by orienting the magnet
in the Stern-Gerlach apparatus along some specific direction, we decide along which direction the spin of a particle may manifest itself. By choosing which measurement device to use we can choose the kind of information we want to gain. Since we are unavoidably constrained by the total information content of the system, by choosing which measurement device to use we not only decide what particular knowledge will be gained, but simultaneously what complementary knowledge will be lost after the measurement is performed. Here, a very subtle position was assumed by Pauli [1955], who writes: "The gain of knowledge by means of an observation has as a necessary and natural consequence, the loss of some other knowledge. The observer has however the free choice, corresponding to two mutually exclusive experimental arrangements, of determining what particular knowledge is gained and what other knowledge is lost (complementary pairs of opposites). Therefore every irrevocable interference by an observation about a system alters its state, and creates a new phenomenon in Bohr's sense." A very interesting and closely related position is assumed by Zeilinger [1996], who writes: "Let us consider once again the impossibility of a detailed description of the individual statistical event in the sense of a fundamental unpredictability. I suggest that the fact is very important that while, by choosing the apparatus, we can define which one of two complementary quantities may manifest itself, for example, position or momentum, we have no influence on the value of the quantity. Therefore, as observers we have a qualifying but not a quantifying influence on the quantum phenomenon. The latter, the impossibility of a quantifying influence, is closely connected to the finiteness of the quantum of action. In this I see a necessary consequence of the first, the qualifying influence, in such a way that it ensures that the observer does not have total control over the phenomena in Nature.
The observer can, thus, through his experimental questioning, jostle, so to speak, Nature, depending on which arrangement is chosen, to give answers to different questions that exclude each other - but for the price of not being able to exert a quantifying influence, an influence on which specific result will materialize." What is really fascinating is that we can define which specific attribute of a particle may manifest itself at a distance, by measuring another distant particle that does not interact with the particle under consideration. By choosing which measurement device to use on the first particle we are now free to decide what particular knowledge about a second particle, perhaps a thousand light-years away, will be gained and simultaneously what complementary knowledge will be lost. Consider, for example, the EPR-Bohm pair of spin-1/2 particles known to be in a state of total spin zero. They scatter and separate. In this state no definite proposition can be made about the spin of either particle, yet if the spin components of the two particles are measured
along the same direction, the outcomes will with certainty be found to be opposite. If in a measurement of the spin projection of particle 1 one obtains the value up, then the laws of quantum mechanics assert that the state of particle 2 immediately changes to one in which the spin projection onto the same direction is definitely down. The seeming paradox arises by asking how a measurement on particle 1 can change the state of particle 2 (which might be very far away from the first) from one in which the spin is indefinite to one in which the spin is definite. This can only be a paradox, however, if the change of the state is regarded as a real process which can occur suddenly, for example, as the result of an instantaneous interaction with another system. In the view presented here, however, a description of the state is a description of the information possessed by an observer. A measurement of the spin projection of particle 1 instantaneously gives complete information about the spin projection of particle 2 onto the same direction, because of the special way the spins are correlated (entangled) in this experiment. Still, the observer cannot influence which specific result will be observed in the measurement. This excludes any possibility of instantaneous transmission of information from one side of the measurement to the other.
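The distinction between gaining information about particle 2 and transmitting information to it can be made concrete with the standard quantum prediction for the spin-1/2 singlet state, P(x, y) = (1 - x y cos(angle)) / 4 for outcomes x, y in {+1, -1}. The following is a minimal sketch; the function names are hypothetical.

```python
import math

def singlet_joint_probability(x, y, angle):
    """P(x, y) for outcomes x, y in {+1, -1} on a spin-1/2 singlet pair,
    where `angle` is the angle between the two measurement directions."""
    return (1.0 - x * y * math.cos(angle)) / 4.0

def bob_marginal(y, angle):
    """Probability that particle 2 yields y, summed over the outcomes of particle 1."""
    return sum(singlet_joint_probability(x, y, angle) for x in (+1, -1))

def bob_conditional(y, x, angle):
    """Probability of y for particle 2 once the outcome x of particle 1 is known."""
    joint = singlet_joint_probability(x, y, angle)
    alice_marginal = sum(singlet_joint_probability(x, yy, angle) for yy in (+1, -1))
    return joint / alice_marginal

for angle in (0.0, math.pi / 3, math.pi / 2):
    print(round(angle, 3), bob_marginal(+1, angle), bob_conditional(-1, +1, angle))
```

For equal measurement directions the conditional probability of "down" given "up" equals one, reflecting complete knowledge, while the marginal statistics of particle 2 stay at 1/2 for every setting chosen on particle 1, so nothing can be signaled by that choice.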
B.1 Continuity of Information Implies Analyticity of Information
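From Eq. (3.13) it follows that the functions $f(x)$ and $g(x)$, which specify the transformation matrix, satisfy the equations

\[
f(x + y) = f(x)f(y) - g(x)g(y) \tag{B.43}
\]

and

\[
g(x + y) = f(x)g(y) + f(y)g(x). \tag{B.44}
\]

We will show that the continuity of these functions is a sufficient condition for their analyticity. In other words, if they are continuous functions, then they also have derivatives of any order. In order to prove this, we introduce two linearly independent functions

\[
\xi_1(x) = f(x) + i g(x) \quad \text{and} \quad \xi_2(x) = f(x) - i g(x). \tag{B.45}
\]

It is easy to see that each of them satisfies the functional equation

\[
\xi_i(x + y) = \xi_i(x)\,\xi_i(y), \qquad i = 1, 2. \tag{B.46}
\]

Since each $\xi_i(x)$ is continuous and $\xi_i(0) = f(0) = 1$, then

\[
\lim_{t \to 0} \frac{1}{t} \int_0^t \xi_i(x)\,dx = \xi_i(0) = 1, \tag{B.47}
\]

so that there exists an $s > 0$ for which

\[
\mathrm{Int}(s) := \int_0^s \xi_i(x)\,dx \neq 0. \tag{B.48}
\]

Using Eq. (B.46) we obtain

\[
\xi_i(y)\,\mathrm{Int}(s) = \int_0^s \xi_i(x)\,\xi_i(y)\,dx = \int_0^s \xi_i(x + y)\,dx = \int_y^{y+s} \xi_i(x)\,dx,
\]

that is,

\[
\xi_i(y) = \frac{1}{\mathrm{Int}(s)} \int_y^{y+s} \xi_i(x)\,dx. \tag{B.49}
\]

Because $\xi_i$ is continuous, the right-hand side of Eq. (B.49) is differentiable with respect to $y$. Hence $\xi_i$ is differentiable, and

\[
\xi_i'(y) = \frac{1}{\mathrm{Int}(s)} \bigl( \xi_i(y + s) - \xi_i(y) \bigr) \tag{B.50}
\]

is its first derivative. Now, because $\xi_i$ is a differentiable function, the right-hand side of Eq. (B.50) is also differentiable. This implies that $\xi_i'(y)$ is differentiable, that is, $\xi_i$ has the second derivative

\[
\xi_i''(y) = \frac{1}{\mathrm{Int}(s)} \bigl( \xi_i'(y + s) - \xi_i'(y) \bigr). \tag{B.51}
\]

This further implies the existence of the third derivative of $\xi_i$, etc. Thus, every continuous function that satisfies Eq. (B.46) is analytic. Obviously, this is also valid for the functions

\[
f(x) = \frac{1}{2} \bigl( \xi_1(x) + \xi_2(x) \bigr), \qquad g(x) = \frac{1}{2i} \bigl( \xi_1(x) - \xi_2(x) \bigr). \tag{B.52}
\]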
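The integral representation used in this proof can be checked numerically for one concrete continuous solution of the functional equation, the exponential exp(i*omega*x), whose real and imaginary parts (cosine and sine) play the roles of f and g. The name `xi`, the frequency `omega = 1.3`, the window `s = 0.5` and the trapezoidal quadrature are all illustrative assumptions, not part of the appendix.

```python
import cmath

OMEGA = 1.3   # arbitrary illustrative frequency

def xi(x):
    # One continuous solution of xi(x + y) = xi(x) * xi(y) with xi(0) = 1.
    return cmath.exp(1j * OMEGA * x)

def integrate(func, a, b, steps=20000):
    """Composite trapezoidal rule for a complex-valued function on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for k in range(1, steps):
        total += func(a + k * h)
    return total * h

s, y = 0.5, 0.8
int_s = integrate(xi, 0.0, s)                        # Int(s), non-zero for this s
window_average = integrate(xi, y, y + s) / int_s     # integral over [y, y + s] divided by Int(s)
derivative = (xi(y + s) - xi(y)) / int_s             # difference quotient built from the same window

print(abs(xi(y) - window_average))                   # small: the representation reproduces xi(y)
print(abs(derivative - 1j * OMEGA * xi(y)))          # small: agrees with the exact derivative
```

Both printed deviations are limited only by the quadrature error, which illustrates how differentiability follows from continuity plus the functional equation for this particular solution.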
B.2 A General Transformation in the Space of Information
If we define the orientations of the three mutually orthogonal directions $\vec{n}_1(\psi, \theta, \phi)$, $\vec{n}_2(\psi, \theta, \phi)$ and $\vec{n}_3(\psi, \theta, \phi)$ in ordinary space by the Euler angles $0 \le \psi < 2\pi$, $0 \le \theta \le \pi$, $0 \le \phi < 2\pi$, we can represent the three mutually complementary propositions associated with the spin-1/2 measurements along these directions in terms of the Euler angles as $P_1(\psi, \theta, \phi)$: "The spin along the direction $\vec{n}_1(\psi, \theta, \phi)$ is up", $P_2(\psi, \theta, \phi)$: "The spin along the direction $\vec{n}_2(\psi, \theta, \phi)$ is up" and $P_3(\psi, \theta, \phi)$: "The spin along the direction $\vec{n}_3(\psi, \theta, \phi)$ is up". Given a specific set of three orthogonal directions, all other sets of orthogonal directions can be obtained by rotating the reference set. The general rotation for the Euler angles $\psi$, $\theta$, $\phi$ can be performed as a sequence of three rotations, the first around the z-axis by $0 \le \phi < 2\pi$, the second around the y-axis by $0 \le \theta \le \pi$ and the third again around the z-axis by $0 \le \psi < 2\pi$. A list of mutually complementary propositions associated with the spin measurements along the directions obtained by the first rotation can be represented as $P_1(0, 0, \phi)$, $P_2(0, 0, \phi)$, $P_3(0, 0, \phi)$. Now, applying a similar argument as in Sec. 3.3, one obtains

\[
R(\phi) = \begin{pmatrix} \cos\phi & -\sin\phi & 0 \\ \sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{B.53}
\]

for the transformation matrix in the space of information¹⁰. If we fix the angle of the first rotation at $\phi_o$ and consider only the propositions $P_1(0, \theta, \phi_o)$, $P_2(0, \theta, \phi_o)$ and $P_3(0, \theta, \phi_o)$ about spins along the directions obtained by the second rotation around the y-axis by an angle $0 \le \theta \le \pi$, the corresponding transformation matrix reads

\[
\tilde{R}(\theta) = \begin{pmatrix} \cos\theta & 0 & \sin\theta \\ 0 & 1 & 0 \\ -\sin\theta & 0 & \cos\theta \end{pmatrix}. \tag{B.54}
\]

In the last step we fix both the angle $\phi_o$ of the first rotation and the angle $\theta_o$ of the second rotation, and consider only sets of mutually complementary propositions $P_1(\psi, \theta_o, \phi_o)$, $P_2(\psi, \theta_o, \phi_o)$ and $P_3(\psi, \theta_o, \phi_o)$ about spins along the directions obtained by the third rotation around the z-axis for $0 \le \psi < 2\pi$. The corresponding transformation matrix is again of the form (B.53) with the angle $\psi$. Finally, the transformation matrix for a general rotation in the space of information is given as

\[
\hat{R}(\psi, \theta, \phi) = R(\psi)\,\tilde{R}(\theta)\,R(\phi) =
\begin{pmatrix}
\cos\psi\cos\theta\cos\phi - \sin\psi\sin\phi & -\cos\psi\cos\theta\sin\phi - \sin\psi\cos\phi & \cos\psi\sin\theta \\
\sin\psi\cos\theta\cos\phi + \cos\psi\sin\phi & -\sin\psi\cos\theta\sin\phi + \cos\psi\cos\phi & \sin\psi\sin\theta \\
-\sin\theta\cos\phi & \sin\theta\sin\phi & \cos\theta
\end{pmatrix}. \tag{B.55}
\]

¹⁰ One should always keep in mind the difference between directions along which mutually complementary measurements are performed in ordinary space (such as the vertical direction and the direction at +45° along which a photon's polarization is measured, or the orthogonal directions along which a particle's spin components are measured) and directions associated with mutually complementary propositions (components of an information vector) in the space of information. The latter always constitute an orthogonal coordinate system.
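The composition of the three rotations leading to (B.55) can be checked numerically. The sketch below assumes the usual right-handed sign convention for rotations about the z- and y-axes, and the angle names `psi`, `theta`, `phi` are labels chosen for this sketch (`phi` is applied first, `psi` last). It also verifies that the length of an information vector, and hence the total information, is unchanged by the transformation.

```python
import math

def rot_z(a):
    """Rotation about the z-axis by angle a (assumed right-handed convention)."""
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_y(a):
    """Rotation about the y-axis by angle a (assumed right-handed convention)."""
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def apply(R, v):
    return tuple(sum(R[i][k] * v[k] for k in range(3)) for i in range(3))

def euler_zyz(psi, theta, phi):
    """First rotate about z by phi, then about y by theta, then about z by psi."""
    return matmul(rot_z(psi), matmul(rot_y(theta), rot_z(phi)))

psi, theta, phi = 0.4, 1.1, 2.3
R = euler_zyz(psi, theta, phi)

# Closed-form upper-left entry of the composed matrix, for comparison.
r00 = math.cos(psi) * math.cos(theta) * math.cos(phi) - math.sin(psi) * math.sin(phi)

info = (0.3, -0.5, 0.2)            # an arbitrary information vector
rotated = apply(R, info)
print(abs(R[0][0] - r00))
print(sum(c * c for c in info), sum(c * c for c in rotated))
```

The two printed sums of squares agree up to rounding, which is the invariance of the total information under a change of the complete set of complementary propositions.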
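We note that the transformation matrix may also be defined in terms of some general physical parameters $\vec{u} = (u_1, u_2, u_3)$ and $\omega$ as

\[
R_{jk}(\vec{u}, \omega) = \delta_{jk} \cos\omega + u_j u_k (1 - \cos\omega) - \sin\omega \sum_{l=1}^{3} \epsilon_{jkl}\, u_l, \tag{B.56}
\]

where $\epsilon_{jkl}$ is the totally antisymmetric tensor. This then describes a rotation around the unit direction $(u_1, u_2, u_3)$ by an angle $\omega$ in the space of information.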
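A rotation about a unit direction can be written down with Rodrigues' formula, which is built from the Kronecker delta, the product u_j u_k and the totally antisymmetric tensor. The sketch below is an illustration under an assumed right-handed sign convention; the function names and the chosen axis and angle are hypothetical. It checks that the axis itself is left fixed and that the z-axis case has the familiar cosine and sine entries.

```python
import math

def levi_civita(j, k, l):
    """Totally antisymmetric tensor for indices in {0, 1, 2}."""
    return (j - k) * (k - l) * (l - j) / 2

def axis_angle_matrix(u, angle):
    """Rotation by `angle` about the unit vector u (Rodrigues' formula)."""
    c, s = math.cos(angle), math.sin(angle)
    R = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        for k in range(3):
            delta = 1.0 if j == k else 0.0
            cross = sum(levi_civita(j, k, l) * u[l] for l in range(3))
            R[j][k] = c * delta + (1.0 - c) * u[j] * u[k] - s * cross
    return R

u = (1 / 3, 2 / 3, 2 / 3)                      # a unit vector: 1/9 + 4/9 + 4/9 = 1
angle = 0.9
R = axis_angle_matrix(u, angle)
image_of_axis = [sum(R[j][k] * u[k] for k in range(3)) for j in range(3)]

Rz = axis_angle_matrix((0.0, 0.0, 1.0), angle)  # special case: rotation about z
print(image_of_axis)                            # equals u up to rounding
print(Rz[0][0], Rz[0][1], Rz[1][0])             # cos, -sin, sin of the angle
```

Choosing the axis along z reproduces the form of a plane rotation in the first two components, so the axis-angle and Euler-angle descriptions agree in that special case.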
Conclusions
The laws we discover about Nature do not already exist as "Laws of Nature" in the outside world. Rather, Laws of Nature are necessities of the mind for any possibility to make sense whatsoever out of the data of experience. This epistemological structure is a necessity behind the form of all laws an observer can discover. As von Weizsäcker has put it, and Heisenberg quoted in [1958]: "Nature is earlier than man, but man is earlier than natural science." An observer is inescapably suspended in the situation of collecting the data of observation, forming concepts of Nature therefrom, and predicting future data. In observing, she/he is able to distinguish only two results at each interval of time. Therefore the experience of the ultimate experimenter is a stream of yes or no answers to the questions posed to Nature. Every concept of Nature can in the last analysis be based on binary questions. Even the concept of the system itself is a useful construct which we introduce as possessing the property of giving a definite answer to the yes-no questions that are posed through observations. We may further introduce the most elementary system as a system which gives a definite answer to one single binary question only. Answers of an elementary system to other, different questions must then necessarily contain an element of randomness. A precise answer to one specific question therefore excludes any possibility for an elementary system to provide a definite answer to complementary questions. Without assuming any additional physical structure, we thus let the fundamental features of quantum mechanics, namely the irreducible randomness of an individual event and complementarity, be a consequence of the very natural principle that an elementary system gives a definite answer to a single experimental question only.
The principle of quantization of information, that an elementary system gives a definite answer (yes or no) to one binary question only, implies that it is pointless to look for a cause of the individual event. This is confirmed by theorems like those of Bell [Bell, 1964] and Greenberger-Horne-Zeilinger [Greenberger et al., 1989, 1990], which state that the randomness of an individual quantum event cannot be derived from local causes (local hidden
variables). Quantum mechanics is not able to explain why (specific) events happen, as pointed out by Bell [1990]. Any more detailed description of reality that would be able to give an unambiguous answer to Bell's question, that is, any description that would be able to arrive at an accurate and detailed prediction of the particular process resulting in a particular event, would necessarily include the definition of a number of hidden properties of the system which would carry information as to which specific result will be observed for all possible future measurements. To Bell's question of why specific events happen, no answer can therefore be given, because if we could give an answer it would mean that a quantum system carries enough information to provide definite answers to all questions that could be asked experimentally, which is forbidden by the principle of quantization of information. Any concept of an existing reality is a mental construction based on observations. Yet this does not imply that reality is no more than a purely subjective human construct. From our observations we are able to build up objects with a set of properties that do not change under variations of the modes of observation and description. These are invariants with respect to these variations. Predictions based on any such specific invariant objects may then be checked by anyone, and the validity of the concepts constructed should not be restricted to phenomena taking place in some well-defined experimental context. As a result we may arrive at an intersubjective agreement about the model, thus lending a sense of independent reality to the mentally constructed objects. In quantum experiments an observer may decide to measure a different set of complementary variables, thus gaining certainty about one or more variables at the expense of losing certainty about the other(s).
The total uncertainty or, equivalently, the total information carried by the system is invariant under such a transformation from one complete set of complementary variables to another. While in a classical world view a property of a system is a primary concept, prior to and independent of observation, and information is a secondary concept which measures our ignorance about properties of the system, in the view of quantum mechanics the notion of the total information of the system emerges as a primary concept, independent of the particular complete set of complementary experimental procedures the observer might choose. A property of the system becomes a secondary concept, a specific representation of the information of the system that is created spontaneously in the measurement itself¹¹.
¹¹ It was clearly stated in the papers of Bohr and Heisenberg that information may serve as a guiding concept in a search for a deeper understanding of reality. We quote Bohr [1934]: "... a subsequent measurement to a certain degree deprives the information given by a previous measurement of its significance for predicting the future course of phenomena. Obviously, these facts not only set a limit to the extent of the information obtainable by measurement, but they also set a limit to the meaning which we may attribute to such information. We meet here in a new light the old truth that in our description of nature the purpose is not to disclose the real essence of the phenomena but only to track down, so far as it is possible, relations between the manifold aspects of our experience." and Heisenberg [1958]: "The laws of nature which we formulate mathematically in quantum theory deal no longer with the particles themselves but with our knowledge of the elementary particles. ... The conception of objective reality ... evaporated into the ... mathematics that represents no longer the behavior of elementary particles but rather our knowledge of this behavior."

Quantum theory supplies a set of rules for how the physical conditions of an experimental arrangement determine the probabilities for the different possible results of the experiment. The origin of these rules does not seem to be clear. But understanding it is necessary if we want to understand how we can know what physical conditions we prepared in an experiment, from which in turn we can calculate the probabilities for different results. From what deeper foundation does the familiar sinusoidal relation between the probabilities and the laboratory parameters emerge? In this work we suggested defining the total information content of a quantum system as a sum of individual measures of information over a complete set of mutually complementary observations. The assumption that the total information content of the system is invariant under the choice of a particular set of mutually complementary observations (the total knowledge of the system is invariant under a change of representation of the catalog of our knowledge about the system), together with the assumption of the homogeneity of the laboratory parametric axis, then necessarily leads to the sinusoidal relation between probabilities and laboratory parameters in quantum mechanics, without any input from quantum theory. Why is the mathematical representation of the knowledge of the system in quantum theory characterized by complex quantities which are very remote from our knowledge? We gain some insight into this if we consider what quantum mechanics looks like when it is not expressed in terms of complex probability amplitudes. If the origin of the structure of quantum mechanics is to be sought in a theory of observations, of observers, and of meaning, then we would do well to focus our attention not on amplitudes, but on quantities which are more directly observable. After all, quantum phenomena do not occur in a Hilbert space. They occur in a laboratory.
In the present work we obtained all of the essential features of quantum mechanics in terms of knowledge, i.e. information, of the system, which is a directly observable quantity. We define the total information content of a system as an invariant found in different sets of mutually complementary observations. It turns out that the lowest symmetry common to all elementary systems is the invariance of their information content with respect to a rotation in a three-dimensional space. This seems to justify the use of three-dimensional space as the space of the inferred world. At this point the dimensionality of our space appears to be directly related to the lowest possible number of mutually exclusive questions we may pose to an elementary system. The Hilbert-space
structure is implicitly contained in, and can easily be revealed from, the structure specified by the space of information¹². The complex nature of the probability amplitudes is again a consequence of the fact that the lowest number of mutually complementary observations is three. This may easily be seen in the theory of a spin-1/2 particle, where the state of a particle may always be represented by real probability amplitudes if we restrict our consideration to two dimensions. In the search for a deeper understanding of quantum mechanics we are at the beginning, not at the end¹³. We hope we have made here a further step in fulfilling Wheeler's [1989] program: "It from bit. Otherwise put, every it - every particle, every field or force, even the spacetime continuum itself - derives its function, its meaning, its very existence entirely - even if in some contexts indirectly - from the apparatus-elicited answers to yes or no questions, binary choices, bits."
¹² This is a consequence of the fact that the rotation group SO(3) in three-dimensional real space is homomorphic (locally isomorphic) to the group SU(2) of unitary transformations in a two-dimensional complex (Hilbert) space.
¹³ Bohr [1935] writes in his famous answer to the Einstein-Podolsky-Rosen [1935] paper: "In fact it is only the mutual exclusion of any two experimental procedures, permitting the unambiguous definition of complementary physical quantities, which provides room for new physical laws, the coexistence of which might at first sight appear irreconcilable with the basic principles of science. It is just this entirely new situation as regards the description of physical phenomena, that the notion of complementarity aims at characterizing."
Operationally Invariant Quantum Information
Č. Brukner and Anton Zeilinger
Phys. Rev. Lett. (in press)
In any individual quantum measurement with discrete variables a number of different outcomes are possible, for example in a spin-1/2 measurement the individual outcomes "spin up" and "spin down". We define a new measure of information for an individual quantum measurement based on the fact that the only features defined before the measurement is performed are the specific probabilities for all possible individual outcomes. The observer is free to choose different experiments which might even completely exclude each other, for example measurements of orthogonal components of spin. This quantum complementarity of variables occurs when the corresponding operators do not commute. One quantity, for example the z-component of spin, might be well defined at the expense of maximal uncertainty about the other orthogonal components. We define the total information content in a quantum system to be the sum over all individual measures for a complete set of mutually complementary experiments. The experimentalist may decide to measure a different set of complementary variables, thus gaining certainty about one or more variables at the expense of losing certainty about others. In the case of spin this could be the projections along rotated directions, for example where the uncertainty in one component is reduced but the one in another component is increased correspondingly. Intuitively one expects that the total uncertainty or, equivalently, the total information carried by the system is invariant under such a transformation from one complete set of complementary variables to another. We show that the total information defined according to our new measure has exactly that invariance property. It is also conserved in time if there is no information exchange with an environment. We find that the total information amounts to k bits for a system consisting of k qubits.
For a composite system, maximal entanglement results if the total information carried by the system is exhausted in specifying joint properties, with no individual qubit carrying any information on its own. We interpret our results as implying that information is the most fundamental notion in quantum mechanics.

Every reasonably well-designed experiment tests some proposition. Knowledge of the state of a quantum system permits the prediction of individual outcomes with certainty only for that limited class of experiments which have definite outcomes, a situation where the corresponding propositions have definite truth values. From theorems like Kochen-Specker [1] we know that in quantum mechanics it is not possible, not even in principle, to assign definite noncontextual truth values to all conceivable propositions. About indefinite propositions we can only make probabilistic predictions. Consider a stationary experimental arrangement with n possible outcomes. Knowing the probabilities $\vec{p} = (p_1, ..., p_j, ..., p_n)$ for the outcomes, all an experimenter can predict is how many times on average a specific outcome will occur. In making his prediction he has only a limited number of systems to work with. Then, because of the statistical fluctuations associated with any finite number of experimental trials, the number $n_j$ of occurrences of a specific outcome j in N future repetitions of the experiment is not precisely predictable. Rather, the experimenter's uncertainty (mean-square deviation), or lack of information, in the value $n_j$ is [2]
$$\sigma_j^2 = p_j (1 - p_j) N. \qquad (1)$$
This implies that for a sufficiently large number N of experimental trials the confidence interval is given as $(p_j N - \sigma_j, p_j N + \sigma_j)$. Therefore, if we just plan to perform the experiment N times, we know in advance, before the experiments are performed and their outcomes become known, that the number $n_j$ of future occurrences of the outcome j will be found with probability 68% within the confidence interval. Notice that the experimenter's lack of information (1) is proportional to the number of trials. This important property guarantees that each individual performance of the experiment contributes the same amount of information, no matter how many times the experiment has already been performed. After each trial the experimenter's lack of information about the outcome j therefore decreases by
$$U(p_j) = \frac{\sigma_j^2}{N} = p_j (1 - p_j). \qquad (2)$$

they exhaust all definable knowledge about the object concerned and suggest summing the individual measures of information (Eq. 4) over a complete set of m mutually complementary observables

$$I_{total} = \sum_{j=1}^{m} I_j(\vec{p}). \qquad (5)$$
The quantity $U(p_j)$ of Eq. (2) is the lack of information about the outcome j with respect to a single future experimental trial. In this view we suggest defining the total lack of information regarding all n possible experimental outcomes as
$$U(\vec{p}) = \sum_{j=1}^{n} U(p_j) = \sum_{j=1}^{n} p_j (1 - p_j) = 1 - \sum_{j=1}^{n} p_j^2. \qquad (3)$$
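Eqs. (1) to (3) can be checked numerically. The following is a minimal sketch, not part of the original paper: the function names are hypothetical, and the exact binomial sum is used here only as an illustration of the 68% statement, which in the text refers to the large-N limit.

```python
import math
from typing import Sequence


def sigma_squared(p_j: float, n_trials: int) -> float:
    """Eq. (1): mean-square deviation of the count n_j in n_trials trials."""
    return p_j * (1.0 - p_j) * n_trials


def lack_of_information(p: Sequence[float]) -> float:
    """Eq. (3): U(p) = sum_j p_j (1 - p_j) = 1 - sum_j p_j^2 (per single trial)."""
    return sum(p_j * (1.0 - p_j) for p_j in p)


def prob_within_one_sigma(p_j: float, n_trials: int) -> float:
    """Exact binomial probability that n_j lies strictly inside
    (p_j N - sigma_j, p_j N + sigma_j). Assumes 0 < p_j < 1."""
    sigma = math.sqrt(sigma_squared(p_j, n_trials))
    mean = p_j * n_trials
    total = 0.0
    for n in range(n_trials + 1):
        if mean - sigma < n < mean + sigma:
            # Work with logarithms to avoid overflow of binomial coefficients.
            log_pmf = (
                math.lgamma(n_trials + 1)
                - math.lgamma(n + 1)
                - math.lgamma(n_trials - n + 1)
                + n * math.log(p_j)
                + (n_trials - n) * math.log(1.0 - p_j)
            )
            total += math.exp(log_pmf)
    return total


if __name__ == "__main__":
    # sigma_j^2 / N does not depend on N: every trial contributes equally (Eq. 2).
    print(sigma_squared(0.5, 1000) / 1000, sigma_squared(0.5, 10) / 10)
    # Minimal and maximal lack of information for two outcomes.
    print(lack_of_information([1.0, 0.0]), lack_of_information([0.5, 0.5]))
    # Fraction of cases with n_j inside the one-sigma interval.
    print(prob_within_one_sigma(0.5, 1000))
```

For two outcomes the lack of information ranges from 0 (one outcome certain) to 1/2 (both outcomes equally probable), which is the range the normalization in Eq. (4) is designed to map onto 1 bit and 0 bits.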
A set of propositions associated with certain quantum-mechanical experiments is mutually complementary if complete knowledge of the truth value of any one of the propositions implies maximal uncertainty about the truth values of the others. Such a complete set of propositions for a spin-1/2 particle can be, for example: "The spin along the x-axis is up", "The spin along the y-axis is up" and "The spin along the z-axis is up".
The uncertainty is minimal if one probability is equal to one and it is maximal if all probabilities are equal. This suggests that the knowledge, or information, with respect to a single future experimental trial that an experimentalist possesses before the experiment is performed is somehow the complement of $U(\vec{p})$ and, furthermore, that it is a function of the sum of the squares of the probabilities. A first ansatz therefore would be $I(\vec{p}) = 1 - U(\vec{p}) = \sum_{i=1}^{n} p_i^2$. Expressions of such a general type were studied in detail by Hardy, Littlewood and Pólya [3]. Notice that this expression can also be viewed as describing the squared length of the probability vector $\vec{p}$. Obviously, because of $\sum_i p_i = 1$, not all vectors in probability space are possible. Indeed, the minimum length of $\vec{p}$ is attained when all $p_i$ are equal ($p_i = 1/n$). This corresponds to the situation of complete lack of information in an experiment. Therefore we suggest normalizing the measure of information in an individual quantum measurement, obtaining finally
$$I(\vec{p}) = \mathcal{N} \sum_{i=1}^{n} \left( p_i - \frac{1}{n} \right)^2. \qquad (4)$$

FIG. 1. Principle sketch of arrangements to consider mutually exclusive classes of information in an interference experiment with a Mach-Zehnder type of interferometer. Into each of the two paths of the interferometer in Fig. 1b one detector is inserted that can detect the particle without absorbing it.
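The measure of Eq. (4) is simple to evaluate. The sketch below is an illustration only: the function names are hypothetical, and it uses the normalization $\mathcal{N} = 2^k k/(2^k - 1)$ that the text gives for $n = 2^k$ outcomes.

```python
from typing import Sequence


def normalization(k: int) -> float:
    """Normalization for n = 2**k outcomes, chosen so that certainty gives k bits."""
    n = 2 ** k
    return n * k / (n - 1)


def information(p: Sequence[float]) -> float:
    """Eq. (4): I(p) = N * sum_i (p_i - 1/n)^2 for n = 2**k outcomes."""
    n = len(p)
    k = n.bit_length() - 1
    if n < 2 or 2 ** k != n:
        raise ValueError("number of outcomes must be a power of two, n = 2**k")
    if abs(sum(p) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to one")
    return normalization(k) * sum((p_i - 1.0 / n) ** 2 for p_i in p)


if __name__ == "__main__":
    print(information([1.0, 0.0]))            # one outcome certain, n = 2
    print(information([0.5, 0.5]))            # complete lack of information
    print(information([1.0, 0.0, 0.0, 0.0]))  # one outcome certain, n = 4
```

With one outcome certain the measure evaluates to k bits, and with all outcomes equally probable it vanishes, in line with the two limiting cases described in the text.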
Considering from now on those cases where maximally k bits of information can be encoded, i.e. $n = 2^k$, the normalization is $\mathcal{N} = 2^k k/(2^k - 1)$. Then $I(\vec{p})$ results in k bits of information if one $p_i = 1$ and it results in 0 bits of information when all $p_i$ are equal. We emphasize that our measure of information is not equal to Shannon's information. While Shannon's information is applicable when measurement reveals a pre-existing property [4], our measure of information takes into account that, in general, a quantum measurement does not reveal a pre-existing property. Having defined the information content in an individual quantum measurement, we now ask what the total information content in a quantum system is. We recall Bohr's [5] remark that ... phenomena under different experimental conditions, must be termed complementary in the sense that each is well defined and that together
We now analyze the mutually complementary propositions in an interference experiment with an idealized Mach-Zehnder type of interferometer (Fig. 1). Suppose that in the presence of a specific phase shift $\phi$ between the two beams inside the interferometer (Fig. 1a), the particle will exit with certainty towards the upper (lower) detector behind the beam splitter. In this case we have complete knowledge of the beam the particle will be found in behind the beam splitter, at the expense of having absolutely no knowledge of which path the particle took inside the interferometer. The state of the particle is then represented by the truth value (true or false) of the proposition: (1) "The particle takes the outgoing path towards the upper detector in presence of the phase shift $\phi$". In contrast, if we know which path the particle took through the interferometer (Fig. 1b), no interference results and hence it is completely uncertain which outgoing path the particle will take. The state of the particle can now be specified by the truth value of the proposition:
(2) "The particle takes the upper path inside the interferometer". Knowing that spin-1/2 affords a model of the quantum mechanics of all two-state systems, i.e. qubits, we expect that there are always three mutually complementary propositions whenever binary alternatives are considered. Indeed, it can easily be shown that even without path information our knowledge of the beam the particle will be found in behind the beam splitter in Fig. 1a will be completely removed if we introduce an additional phase shift of $\pi/2$ between the two beams inside the interferometer. Then, in the new arrangement in Fig. 1c, both outgoing beams will be equally probable. Now, suppose that in the presence of a specific phase shift $\phi + \pi/2$ (Fig. 1c), the particle will exit with certainty towards the upper (lower) detector. The state of the system is now represented by the truth value of the proposition (3) "The particle takes the outgoing path towards the upper detector in presence of the phase shift $\phi + \pi/2$". For a particle in that state we have complete knowledge of the outgoing beam the particle will take (Fig. 1c) at the expense of having absolutely no knowledge about either the path inside the interferometer (Fig. 1b) or the outgoing path in the arrangement in Fig. 1a. Notice that we can label various sets of the 3 mutually complementary propositions by the value of the phase shift. The 3 propositions we found for the interferometer are formally equivalent to the complementary propositions about spin-1/2: (1) "The spin is up along $\phi$ in the x-y plane", (2) "The spin is up along the z-axis", and (3) "The spin is up along $\phi + \pi/2$ in the x-y plane". Here, the direction $\phi$ is assumed to lie in the x-y plane, oriented at an angle $\phi$ with respect to the x-axis. Evidently, this analogy can even be carried further using the concept of multiports. Therefore from now on we will explicitly discuss spin measurements only, keeping in mind the applicability of these ideas to interference experiments.
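The three interferometer arrangements can be illustrated with a small amplitude calculation. This is a sketch under stated assumptions, not the authors' construction: it assumes ideal symmetric 50/50 beam splitters with a factor i on reflection, a phase shifter in one arm, and the hypothetical convention that output port 0 is the "upper" detector. With these conventions the particle exits through port 0 with certainty at a phase of $\pi$.

```python
import cmath
import math
from typing import Tuple


def beam_splitter(a0: complex, a1: complex) -> Tuple[complex, complex]:
    """Ideal symmetric 50/50 beam splitter (assumed convention: factor i on reflection)."""
    s = 1.0 / math.sqrt(2.0)
    return s * (a0 + 1j * a1), s * (1j * a0 + a1)


def output_probabilities(phase: float, which_path_known: bool = False) -> Tuple[float, float]:
    """Probabilities for the two output ports of a Mach-Zehnder interferometer.

    phase: phase shift applied in arm 0 (Fig. 1a and 1c).
    which_path_known: if True, the path is detected inside the interferometer
    (Fig. 1b), so the two paths contribute incoherently.
    """
    if which_path_known:
        probs = [0.0, 0.0]
        for arm in ((1.0, 0.0), (0.0, 1.0)):
            out = beam_splitter(*arm)
            for port in (0, 1):
                probs[port] += 0.5 * abs(out[port]) ** 2
        return probs[0], probs[1]
    a0, a1 = beam_splitter(1.0, 0.0)   # first beam splitter, particle enters port 0
    a0 *= cmath.exp(1j * phase)        # phase shifter in arm 0
    out0, out1 = beam_splitter(a0, a1)  # second beam splitter
    return abs(out0) ** 2, abs(out1) ** 2


if __name__ == "__main__":
    phi = math.pi  # with the conventions above: certain exit through port 0
    print(output_probabilities(phi))
    print(output_probabilities(phi + math.pi / 2))  # both ports equally probable
    print(output_probabilities(phi, which_path_known=True))
```

Complete knowledge of the outgoing beam at phase $\phi$ is thus accompanied by equal probabilities both for the arrangement with the additional phase $\pi/2$ and for the arrangement with which-path detection, which is the mutual complementarity described above.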
We realize that the total information content of the system is

$$I_{total} = I_1(p_1^+, p_1^-) + I_2(p_2^+, p_2^-) + I_3(p_3^+, p_3^-) = 2\,\mathrm{Tr}\rho^2 - 1. \qquad (6)$$

Here, e.g., $p_1^+$ is the probability to find the particle in the state with spin up along $\phi$. Evidently, this is invariant under unitary transformations. Also, this results in just 1 bit of information for a pure state, when 1 single proposition with definite truth value is assigned to the system, and in 0 bits of information for a completely mixed state, when no proposition with definite truth value can be made about the system. Note that the total information content of a quantum system is completely specified by the state of the system alone and independent of the physical parameter
$\phi$ (phase shift) that labels various sets of mutually complementary observations. In the same spirit as choosing a coordinate system, one may choose any set of mutually complementary propositions to represent our knowledge of the system. The total information about the system will then be invariant under that choice. This is the reason we may use the phrase "the total information content" without explicitly specifying the particular reference set of mutually complementary propositions. Also note that the total information content of the system is conserved in time if there is no information exchange with the environment, that is, if the system is dynamically independent from the environment and not exposed to a measurement. Wootters and Zurek [6] found for a double-slit experiment that we can obtain some partial knowledge about the particle's path and still observe an interference pattern of reduced contrast as compared to the ideal interference situation. Englert [7] has proposed an inequality to describe quantitatively the complementarity between path information and interference pattern in a Mach-Zehnder type of interferometer. Our results indicate that we have to take into account not just these two variables, but three. Then the rigorous equality Eq. (6) results. In order to analyze the simplest composite system in view of the ideas just proposed, let us consider two qubits. An explicit example will again be two spin-1/2 particles. We will consider a set of mutually complementary pairs of propositions where precise knowledge of the truth values of a specific pair of propositions excludes any knowledge of the truth values of other complementary pairs of propositions. As opposed to the single-particle case, where 3 individual propositions are complementary to each other, in the two-particle case we have 5 pairs of propositions where each pair is complementary to each other pair [8].
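The invariance expressed by Eq. (6) for a single qubit can be checked numerically. The sketch below rests on an assumption not spelled out in the text: the qubit state is parameterized by a Bloch vector r, so that the probability of "spin up" along a unit vector n is (1 + r·n)/2. All function names are hypothetical.

```python
import math
from typing import Sequence


def prob_up(r: Sequence[float], n: Sequence[float]) -> float:
    """Probability of spin up along unit vector n for Bloch vector r (assumed parameterization)."""
    return 0.5 * (1.0 + sum(ri * ni for ri, ni in zip(r, n)))


def binary_information(p_up: float) -> float:
    """Eq. (4) for n = 2 outcomes, where the normalization equals 2."""
    return 2.0 * ((p_up - 0.5) ** 2 + ((1.0 - p_up) - 0.5) ** 2)


def total_information(r: Sequence[float], phi: float) -> float:
    """Sum of Eq. (4) over the three complementary directions labeled by phi."""
    directions = [
        (math.cos(phi), math.sin(phi), 0.0),   # along phi in the x-y plane
        (0.0, 0.0, 1.0),                       # along z
        (-math.sin(phi), math.cos(phi), 0.0),  # along phi + pi/2 in the x-y plane
    ]
    return sum(binary_information(prob_up(r, n)) for n in directions)


def purity(r: Sequence[float]) -> float:
    """Tr(rho^2), computed from the explicit 2x2 density matrix."""
    rx, ry, rz = r
    rho = [
        [0.5 * (1.0 + rz), 0.5 * complex(rx, -ry)],
        [0.5 * complex(rx, ry), 0.5 * (1.0 - rz)],
    ]
    # For a Hermitian matrix, Tr(rho^2) equals the sum of |rho_ij|^2.
    return sum(abs(x) ** 2 for row in rho for x in row)


if __name__ == "__main__":
    r = (0.3, -0.4, 0.5)  # an arbitrary mixed state
    for phi in (0.0, 0.7, 2.1):
        print(phi, total_information(r, phi), 2.0 * purity(r) - 1.0)
```

For every value of the label $\phi$ the sum over the three propositions coincides with $2\,\mathrm{Tr}\rho^2 - 1$, giving 1 bit for a pure state and 0 bits for the completely mixed state.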
We give one possible choice of a complete set of pairs of complementary propositions for two particles: (1) "The spin of particle 1 is up along z" and "The spin of particle 2 is up along z"; (2) "The spin of particle 1 is up along $\phi_1$" and "The spin of particle 2 is up along $\phi_2$"; (3) "The spin of particle 1 is up along $\phi_1 + \pi/2$" and "The spin of particle 2 is up along $\phi_2 + \pi/2$"; (4) "The spin of particle 1 along z and the spin of particle 2 along $\phi_2$ are the same" and "The spin of particle 1 along $\phi_1$ and the spin of particle 2 along $\phi_2 + \pi/2$ are the same"; (5) "The spin of particle 1 along z and the spin of particle 2 along $\phi_2 + \pi/2$ are the same" and "The spin of particle 1 along $\phi_1$ and the spin of particle 2 along z are the same". Again, the directions $\phi_1$ and $\phi_2$ are both assumed to lie in the x-y plane, oriented at angles $\phi_1$ and $\phi_2$ respectively with respect to the x-axis. In a set of mutually exclusive two-particle interference experiments the angles $\phi_1$ and $\phi_2$ would correspond to phase shifts in two Mach-Zehnder interferometers fed by two particles. We find for the total information carried by the composite system
$$I_{total} = \sum_{j=1}^{5} I_j(\vec{p}^{\,j}) = \frac{2}{3} \left( 4\,\mathrm{Tr}\rho^2 - 1 \right). \qquad (7)$$
Here, $\vec{p}^{\,j} = (p_1^j, p_2^j, p_3^j, p_4^j)$ are the probabilities for the system in the state $\rho$ to give the four possible combinations (true-true, true-false, false-true and false-false) of the truth values for the pair of propositions j. This again is invariant under unitary transformations. Independence of the physical parameters $\phi_1$ and $\phi_2$ implies that the total information of the composite system is invariant under the choice of the particular set of mutually complementary pairs of propositions. Also, the total information of the composite system is conserved in time if there is no information exchange between the composite system and an environment. We note that these results can be generalized to a composite system consisting of k qubits. A composite 2-qubit system in a pure state carries 2 bits of information. That information, contained in 2 propositions, can be distributed over the 2 particles in various ways. It may be carried by the 2 particles individually, e.g., as the two-bit combination false-true of the truth values of the propositions given in (1). This is then represented by the product state $|\psi\rangle_{prod} = |z-\rangle_1 |z+\rangle_2$. The 2 bits of information are thus encoded in the two particles separately, one bit in each particle, just like in classical physics. In that case there is no additional information represented jointly by the 2 systems. Alternatively, 2 bits of information might all be carried by the 2 particles in a joint way, in the extreme with no individual particle carrying any information on its own. For example, this could be the two-bit combination true-false of the truth values of the propositions given in (4). Again, this is represented by the entangled state
ent
(8)
For clarity we emphasize that our total information content of a quantum system is neither mathematically nor conceptually equivalent to von Neumann's entropy. With the only exception of results of measurement in a basis decomposing the density matrix into a classical mixture, when it can be considered as equivalent to Shannon's information [4], the von Neumann entropy is just a measure of the purity of the given density matrix without explicit reference to information contained in individual measurements. In contrast, our information content is purely operational and it refers directly to experimental results of mutually complementary measurements, thus including also those for which the density matrix cannot be decomposed into a classical mixture. Our information content of the system can be viewed as equivalent to the sum of partial knowledges an experimentalist can have about mutually exclusive measurements without any further reference to the structure of the theory. In the present paper we find an operational quantum information invariant that reflects the intrinsic symmetry of the underlying Hilbert space of the system. We interpret our result as implying that a number of essential features of quantum mechanics might be based on the observation [10,11] that the most elementary system represents the truth value of one proposition only. Since this is the only information a quantum system carries, a measurement associated with any other proposition must necessarily contain an element of randomness. This kind of randomness must then be irreducible, that is, it cannot be reduced to hidden properties of the system. Otherwise the system would carry more information than necessary to specify one definite proposition. Entanglement results from the fact that information could also be distributed in joint properties of a multiparticle system.
In particular, maximal entanglement arises when the total information of a composite system is exhausted in specifying joint properties. This work has been supported by the Austrian Science Foundation FWF, Project No. S6503.
where $|x+\rangle, |x-\rangle$ and $|y+\rangle, |y-\rangle$ represent the eigenbases of spin rotated by $\phi_1$ and $\phi_2$ respectively. This Bell state does not contain any information about the individuals; all information is contained in joint properties. In fact, now there cannot be any information carried by the individuals because the two bits of information are exhausted by defining that maximally entangled state, and no further possibility exists to also encode information in individuals. This we see as a quantitative formulation of Schrödinger's [9] idea that "If two separated bodies, each by itself known maximally, enter a situation in which they influence each other, and separate again, then ... the knowledge remains maximal, but at its end, if the two bodies have again separated, it is not again split into a logical sum of knowledges about the individual bodies."
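The two ways of distributing 2 bits over two qubits can be compared numerically. The following sketch is an illustration under stated assumptions: it uses the generic Bell state $(|z+\rangle_1 |z+\rangle_2 + |z-\rangle_1 |z-\rangle_2)/\sqrt{2}$ as a stand-in for a maximally entangled state (not necessarily the specific state of Eq. (8)), it evaluates the information of an individual qubit as $2\,\mathrm{Tr}\rho^2 - 1$ of its reduced density matrix, following Eq. (6), and all function names are hypothetical.

```python
import math
from typing import List, Sequence

Matrix = List[List[complex]]


def purity(rho: Matrix) -> float:
    """Tr(rho^2) for a Hermitian matrix, as the sum of |rho_ij|^2."""
    return sum(abs(x) ** 2 for row in rho for x in row)


def reduced_density_matrix(psi: Sequence[complex], qubit: int) -> Matrix:
    """Reduced state of one qubit of a pure two-qubit state.

    psi holds four amplitudes indexed 2*a + b, where a (b) labels the basis
    state of particle 1 (2): 0 for z+, 1 for z-. qubit is 0 or 1.
    """
    rho = [[0j, 0j], [0j, 0j]]
    for i in range(2):
        for j in range(2):
            for other in range(2):
                if qubit == 0:
                    left, right = psi[2 * i + other], psi[2 * j + other]
                else:
                    left, right = psi[2 * other + i], psi[2 * other + j]
                rho[i][j] += complex(left) * complex(right).conjugate()
    return rho


def single_qubit_information(psi: Sequence[complex], qubit: int) -> float:
    """Eq. (6) applied to the reduced state of one particle."""
    return 2.0 * purity(reduced_density_matrix(psi, qubit)) - 1.0


def total_information(psi: Sequence[complex]) -> float:
    """Eq. (7) for the pure state psi of the composite system."""
    rho = [[complex(a) * complex(b).conjugate() for b in psi] for a in psi]
    return (2.0 / 3.0) * (4.0 * purity(rho) - 1.0)


if __name__ == "__main__":
    s = 1.0 / math.sqrt(2.0)
    product = [0.0, 0.0, 1.0, 0.0]  # |z->_1 |z+>_2
    bell = [s, 0.0, 0.0, s]         # assumed example of a maximally entangled state
    for name, psi in (("product", product), ("bell", bell)):
        print(name, total_information(psi),
              single_qubit_information(psi, 0), single_qubit_information(psi, 1))
```

Both states carry 2 bits in total. In the product state each particle carries 1 bit on its own, whereas in the Bell state the individual particles carry 0 bits, so all information resides in joint properties.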
[1] S. Kochen and E. P. Specker, J. Math. and Mech. 17, 59 (1967).
[2] B. V. Gnedenko, The Theory of Probability (Mir Publishers, Moscow, 1976).
[3] G. Hardy, J. E. Littlewood and G. Pólya, Inequalities (Cambridge University Press, Cambridge, 1952).
[4] Č. Brukner and A. Zeilinger, Conceptual Inadequacy of the Shannon Information in Quantum Measurement (in preparation).
[5] N. Bohr, Atomic Physics and Human Knowledge (Wiley, New York, 1958).
[6] W. K. Wootters and W. H. Zurek, Phys. Rev. D 19, 473 (1979).
[7] B. G. Englert, Phys. Rev. Lett. 77, 2154 (1996).
[8] For $n = 2^k$ there are $2^k + 1$ mutually complementary observables. See W. K. Wootters and B. D. Fields, Ann. of Phys. 191, 363 (1989).
[9] E. Schrödinger, Naturwissenschaften 23, 807 (1935); See also: www.emr.hibu.no/lars/eng/cat
[10] A. Zeilinger, Found. Phys. 29, 631 (1999).
[11] Č. Brukner and A. Zeilinger, Act. Phys. Slov. 89, 647 (1999).
References
Ballentine L. E., 1970, Reviews of Modern Physics 42, 358.
Barenco A., D. Deutsch, A. Ekert and R. Jozsa, 1995(a), Phys. Rev. Lett. 74, 4083.
Barenco A., C. H. Bennett, R. Cleve, D. P. DiVincenzo, N. Margolus, P. Shor, T. Sleator, J. Smolin and H. Weinfurter, 1995(b), Phys. Rev. A 52, 3457.
Bell J. S., 1964, Physics (Long Island City, N.Y.) 1, 195.
Bell J. S., 1990 August, Physics World 33.
Bennett C. H., G. Brassard, and A. K. Ekert, 1992 October, Scientific American, 50.
Bennett C. H., G. Brassard, C. Crépeau, R. Jozsa, A. Peres and W. K. Wootters, 1993, Phys. Rev. Lett. 70, 1895.
Bohm D., 1952, Phys. Rev. 85, 166.
Bohr N., 1928, Nature 121, 580.
Bohr N., 1935, Phys. Rev. 48, 696.
Bohr N., 1949, in Albert Einstein: Philosopher-Scientist, edited by P. A. Schilpp (The Library of Living Philosophers, Evanston, IL) 200. A copy can be found at the web site http://www.emr.hibu.no/lars/eng/schlipp/Default.html
Bohr N., 1958, Atomic Physics and Human Knowledge (Wiley, New York).
Bouwmeester D., J. W. Pan, K. Mattle, M. Eibl, H. Weinfurter and A. Zeilinger, 1997, Nature 390, 575.
Brukner Č. and A. Zeilinger, 1999(a), Operationally Invariant Quantum Information, Phys. Rev. Lett. (in press).
Brukner Č. and A. Zeilinger, 1999(b), Act. Phys. Slov. vol. 89, No. 4, 647.
Brukner Č. and A. Zeilinger, 1999(c), in Experimental and Epistemological Foundations of Quantum Mechanics, edited by D. M. Greenberger, W. Reiter and A. Zeilinger (Vienna Circle Yearbook, Kluwer), in press.
Brukner Č. and A. Zeilinger, 1999(d), Information Content of an Elementary System and the Foundations of Quantum Physics in Proceedings of 14th International Conference on Laser Spectroscopy in Innsbruck (World Scientific).
Bruss D., A. Ekert, S. F. Huelga, J.-W. Pan and A. Zeilinger, 1997, Phil. Trans. R. Soc. Lond. A 355, 2259.
Bužek V. and M. Hillery, 1996, Phys. Rev. A 54, 1844.
Clauser J. F., M. A. Horne, A. Shimony and R. A. Holt, 1969, Phys. Rev. Lett. 23, 880.
Cramer J. C., 1986, Rev. Mod. Phys. 58, 647. A copy can be found at the web site www.mist.npl.washington.edu/tiqm
Einstein A., B. Podolsky and N. Rosen, 1935, Phys. Rev. 47, 777.
Englert B. G., 1996, Phys. Rev. Lett. 77, 2154.
Everett H., 1957, Rev. Mod. Phys. 29, 454.
Faddeev D. K., 1957, in Arbeiten zur Informationstheorie I, edited by H. Grell (Deutscher Verlag der Wissenschaften, Berlin), 88. Russian original in Uspekhi Mat. Nauk., 11 (1956) 227.
Feinstein A., 1958, Foundation of Information Theory (McGraw-Hill, N.Y.), 17.
Feynman R. P., R. B. Leighton and M. Sands, 1965, The Feynman Lectures on Physics (Reading, Massachusetts, Addison-Wesley) vol. III.
Feynman R. P., 1967, The Character of Physical Law (MIT Press, Cambridge, Massachusetts).
Fisher R. A., 1925, Proc. Camb. Phil. Soc., 22, 700. Reprinted in R. A. Fisher, Contributions to Mathematical Statistics (Wiley, N.Y., 1950).
Fivel D. I., 1994, Phys. Rev. A 59, 2108.
Gelfand I. M. and A. M. Yaglom, 1957, in Arbeiten zur Informationstheorie II, edited by H. Grell (Deutscher Verlag der Wissenschaften, Berlin), 7. Russian original in Uspekhi Mat. Nauk., 11 (1957) 3.
Ghirardi G. C., A. Rimini, and T. Weber, 1986, Phys. Rev. D 34, 470.
Gisin N., 1989, Helv. Phys. Act. 62, 363.
Gisin N., 1990, Phys. Lett. A 143, 1.
Gisin N., 1993, Am. J. Phys. 61, 86.
Gnedenko B. V., 1976, The Theory of Probability (Mir Publishers, Moscow).
Grandy W. T., Jr., 1997, Am. J. Phys. 65(6), 466.
Greenberger D. M. and A. Yasin, 1988, Phys. Lett. A 128, 391.
Greenberger D. M., M. Horne and A. Zeilinger, 1989, in Bell's Theorem, Quantum Theory and Conceptions of the Universe, edited by M. Kafatos (Kluwer Academic, Dordrecht).
Greenberger D. M., M. Horne, A. Shimony and A. Zeilinger, 1990, Am. J. Phys. 58, 1131.
Griffiths R. B., 1984, J. Stat. Phys. 36(12), 219.
Hardy G., J. E. Littlewood and G. Pólya, 1952, Inequalities (Cambridge University Press, Cambridge).
Heisenberg W., 1958, Daedalus 87, 95.
Heisenberg W., 1958, Physics and Philosophy, Chapter 3.
Herzog T. J., P. G. Kwiat, H. Weinfurter and A. Zeilinger, 1995, Phys. Rev. Lett. 75, 3034.
Horne M. A., A. Shimony and A. Zeilinger, 1989, Phys. Rev. Lett. 62, 2209.
Jaeger G., A. Shimony and L. Vaidman, 1995, Phys. Rev. A 51, 54.
Jammer M., 1966, The Conceptual Development of Quantum Mechanics (McGraw-Hill, New York).
Jammer M., 1974, The Philosophy of Quantum Mechanics (J. Wiley & Sons, New York).
Jaynes E. T., 1957, Phys. Rev. 106, 622.
Jaynes E. T., 1962, Information Theory in Statistical Physics, Brandeis Summer Institute (W. A. Benjamin Inc., New York).
Kochen S. and E. P. Specker, 1967, J. Math. and Mech. 17, 59.
Kullback S., 1959, Information Theory and Statistics (Wiley, N.Y.).
Lahti P. J., P. Busch and P. Mittelstaedt, 1991, J. Math. Phys. 32, 2770.
Landauer R., 1991 May, Physics Today, 23.
Lévy-Leblond J.-M., 1974, Riv. Nuovo Cimento 4, 99.
Lévy-Leblond J.-M., 1976, Am. J. Phys. 44, 11.
Mattle K., H. Weinfurter, P. G. Kwiat and A. Zeilinger, 1996, Phys. Rev. Lett. 76, 4556-4659.
Mermin N. D., 1990, Am. J. Phys. 58, 58.
Mermin N. D., 1998(a), Pramana 51, 549. See Los Alamos e-print archive quant-ph/9609013.
Mermin N. D., 1998(b), Am. J. Phys. 66, 753. See Los Alamos e-print archive quant-ph/980105.
Mermin N. D., 1998(c), See Los Alamos e-print archive quant-ph/9807055.
Mittelstaedt P., A. Prieur and R. Schieder, 1987, Found. Phys. 17, 891.
Pan J.-W., D. Bouwmeester, H. Weinfurter and A. Zeilinger, 1998, Phys. Rev
Peres A., 1996, Quantum Theory: Concepts and Methods (Kluwer Academic Publishers).
Pauli W., 1955, in Writings on Philosophy and Physics, edited by C. P. Enz and K. von Meyenn, translated by Robert Schlapp (Springer Verlag, Berlin).
Pauli W., 1958 (first edition), 1990 (new edition), Die allgemeinen Prinzipien der Wellenmechanik in Handbuch der Physik, Band V, 1 (Hrsg. S. Flügge, Springer-Verlag).
Rarity J. G. and P. R. Tapster, 1990, Phys. Rev. Lett. 64, 2495.
Rauch H. and J. Summhammer, 1984, Phys. Lett. 104A, 44.
Rényi A., 1962, Wahrscheinlichkeitsrechnung mit einem Anhang über Informationstheorie (Deutscher Verlag der Wissenschaft).
Shannon C. E., 1948, Bell Syst. Tech. J. 27, 379. A copy can be found at the web site http://cm.bell-labs.com/cm/ms/what/shannonday/paper.html
Schrödinger E., 1935, Naturwissenschaften 23, 807. Translation published in Proc. Am. Phil. Soc. 124, 323 and in Quantum Theory and Measurement, edited by J. A. Wheeler and W. H. Zurek (Princeton University Press, New Jersey, 1983). A copy of the translation can be found at the web site www.emr.hibu.no/lars/eng/cat
Scully M. O., B. G. Englert and H. Walther, 1991, Nature 351, 111.
Summhammer J., 1988, Found. Phys. Lett. 1, 123.
Summhammer J., 1994, Int. J. Theor. Phys. 33, 171.
Summhammer J., H. Rauch and D. Tuppinger, 1982, Phys. Rev. A 36, 4447.
Uffink J., 1990, PhD Thesis: Measures of Uncertainty and the Uncertainty Principle (R. U. Utrecht).
von Weizsäcker C. F., 1985, Aufbau der Physik (Carl Hanser, München).
von Weizsäcker C. F., 1975, in Quantum Theory and the Structure of Time and Space II, edited by L. Castell, M. Drieschner and C. F. von Weizsäcker (Hanser, München).
Wheeler J. A., 1983, Law without Law in Quantum Theory and Measurement, edited by J. A. Wheeler and W. H. Zurek (Princeton University Press, Princeton) 182.
Wheeler J. A., 1989, Proc. 3rd Int. Symp. Foundations of Quantum Mechanics, Tokyo, 354.
Weinberg S., 1989(a), Ann. Phys. (NY) 194, 336.
Weinberg S., 1989(b), Phys. Rev. Lett. 62, 485.
Wootters W. K., 1981, Phys. Rev. D 23, 357.
Wootters W. K. and B. D. Fields, 1989, Ann. Phys. 191, 363.
Wootters W. K. and W. H. Zurek, 1982, Nature (London) 299, 802.
Wootters W. K. and W. H. Zurek, 1979, Phys. Rev. D 19, 473.
Zeilinger A., 1986, Physica 137B, 235.
Zeilinger A., 1990, in Quantum Theory without Reduction, edited by M. Cini and J.-M. Lévy-Leblond (Adam Hilger, Bristol and New York), 9.
Zeilinger A., 1996, in Vastakohtien todellisuus, Festschrift for K. V. Laurikainen, edited by U. Ketvel et al. (Helsinki University Press). A copy can be found at the web site www.quantum.at.
Zeilinger A., 1997, Phil. Trans. R. Soc. Lond. A 355, 2401.
Zeilinger A., 1999, Found. Phys. 29, 631.
Żukowski M., A. Zeilinger, M. A. Horne and A. K. Ekert, 1993, Phys. Rev. Lett. 71, 4287.
Acknowledgments
No one deserves more thanks for the success of this work than my advisor Prof. Anton Zeilinger. His openness to unusual ways of interpreting and understanding quantum phenomena, his deep physical insight that often brings enlightenment even where any calculation has failed, his steady support and encouragement throughout the years of our acquaintance, and his confidence that students can make a significant contribution to their chosen area of physics were and are crucial in forming my own way of thought, both in scientific and in secular matters. At a time of perceived declining interest in foundational questions of quantum mechanics I am grateful for having had the opportunity to write this thesis under the guidance of Prof. Anton Zeilinger. I would like to thank Prof. Johann Summhammer. His influence on this dissertation was from a distance, but no less great because of that. Many of the viewpoints espoused here were worked out in conversation with him. I wish to thank the following: Christoph Simon (for fruitful discussions about physics and life and for critically listening to many of the ideas presented here); señor Matthew Daniell Malus (for correcting the English in the thesis); Olaf Nairz (for correcting the German in the Zusammenfassung); Christine Götsch-Obmascher (for her continuous help on various matters throughout the years); Prof. Bašić, my first physics teacher (for initiating my love of physics); Prof. Fedor Herbut, my professor of quantum mechanics in Belgrade (for teaching me that the first step in understanding quantum mechanics is in realizing the size of its nonunderstandability, and, together with Prof. Milan Vujičić, for supporting my decision that finally brought me to continue my studies of physics in Vienna); my mother Olga, brother Ivan, family Radak, Tetka and Vlada (for supporting me during my studies). I gratefully acknowledge the financial support of the Austrian Fonds zur Förderung der wissenschaftlichen Forschung (Projects No. S6502 and F1506) during the research for the thesis.

Finally, I thank my wife Zorica for supporting me and turning my downs into ups throughout the years, and my daughter Isidora and son Sergej, whose joint arrival in the middle of the writing of the thesis gave me additional purpose and strength to see this dissertation through.
Curriculum Vitae

09.07.1967   Born in Novi Sad, Yugoslavia, as the second son of Olga and Bogdan Brukner
1974-1982    Primary school in Novi Sad
1982-1986    Mathematical Gymnasium, specialization in nuclear physics, in Belgrade, Yugoslavia
1986-1987    Military service
1987-1991    Studies of physics at the Faculty of Natural Sciences of the University of Belgrade, Yugoslavia
1992-1995    Continuation of the studies of physics at the Faculty of Formal and Natural Sciences of the University of Vienna, Austria
1993         Work within the project "Literature Search in Process Simulation" at Digital Equipment Corporation, Vienna
13.07.1995   Graduation as Magister of Natural Sciences. Diploma thesis: "Beugung von Materiewellen im Raum und in der Zeit" (Diffraction of Matter Waves in Space and in Time), carried out under Univ. Prof. Dr. Anton Zeilinger
             Marriage to Zorica Mitrović
             Doctoral studies in Technical Physics at the Vienna University of Technology under Univ. Prof. Dr. Anton Zeilinger
             Birth of daughter Isidora and son Sergej