Quantum theory of observation/Quantum theory for beginners

This chapter is for the beginner. It can of course be omitted by a reader who already knows a little quantum physics.

The great principle: the existence of quantum superpositions

Quantum physics can be summarized by a single great principle, the principle of superposition of states, whose meaning is difficult to understand:

Any physical system which can be in the states $|1\rangle$ and $|2\rangle$ can also be in a state $\alpha |1\rangle +\beta |2\rangle$ where $\alpha$ and $\beta$ are any complex numbers.

The same principle can be phrased in an equivalent way:

The space of states of any physical system is a complex vector space.

As far as we know, the validity of the superposition principle is not restricted: it applies to any physical system. There is no boundary between quantum systems, which obey the superposition principle, and classical systems, which would not. All known systems are fundamentally quantum systems, because they are all made of quantum particles.

There is something crazy in this universal validity of quantum superposition. Suppose that $|1\rangle$ and $|2\rangle$ are states of the moon in two different places. If the moon is in the state ${\frac {1}{\sqrt {2}}}(|1\rangle +|2\rangle )$, it seems to be in two different places at the same time. This should be a general phenomenon. With quantum superposition, any system can be simultaneously in as many places as one wants. Could Don Juan multiply his affairs in a quantum way?

The superposition principle cannot be applied to the Don Juan case, or not in a direct and simple way, but the reasons for this are difficult to understand. Why is the moon in a definite place on its orbit? Why is it not evenly distributed in the sky? (cf. 4.6, 4.18 and 4.19) This is known as the Schrödinger's cat problem: can a cat be alive and dead at the same time? (Schrödinger 1935)

The universal validity of the superposition principle can be illustrated with many examples: wave-particle duality and light polarization are very direct applications. Physical explanations which depend on the superposition principle are extremely numerous: properties of elementary particles, stability of atoms, of molecules and of materials, radioactivity, existence of metals, of semiconductors and of insulating materials, superconductivity, superfluidity, lasers, and so on. The superposition principle explains all these diverse phenomena in a unified way.

Wave-particle duality

Is light a flow of particles or a wave phenomenon? Light rays could be particle paths, and they were regarded thus by Newton in his Opticks. The reflection of light in a mirror is then naturally interpreted with the hypothesis that particles of light, or photons, are like bouncing balls. Nevertheless Huygens argued that this phenomenon and others were better interpreted with the hypothesis that light rays are lines perpendicular to wave fronts.

Photography gives evidence of the existence of particles of light, for traces left by light are always like impacts of particles.

But if light is made of particles, how can we explain interference patterns such as those found by Young and Fresnel? Interference is always interference between waves. It seems there cannot be any interference with particles. An interference pattern is experimental evidence that light is a wave phenomenon. It is confirmed by Maxwell's theory of electromagnetism, which defines light as an electromagnetic wave.

That light is made of particles is not contradicted by the existence of interference patterns. Here is what we can see if we look at how an interference pattern appears on a photographic plate:

Real double-slit experiment with electrons. Individual particles appear on the detector, slowly filling in the interference pattern.
Simulated double-slit experiment.

The wave phenomenon, interference, results from impacts of particles.

The superposition principle gives a very direct explanation of wave-particle duality. Any physical system is a particle or a system of particles, but these behave sometimes like waves because they can be in many places at the same time. The wave of a particle or of a system of particles determines its diffuse presence.

For example, the state of a particle can be a wave packet:

This solitary wavelet represents the movement of a particle. The particle can be detected only where its wave is not zero, in the coloured regions.

Because its state is identified with a wave, a particle can interfere with itself.

This is why interference patterns are obtained with particles.
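The self-interference described above can be sketched numerically. The toy model below is an assumption made for illustration and is not taken from the experiment shown: at a given point of the detector, the wave is the sum of two contributions of equal size, one per slit, written as complex numbers (introduced later in this chapter) that differ only by a phase. The names `relative_intensity` and `phase_difference` are hypothetical.

```python
import cmath
import math

def relative_intensity(phase_difference):
    """Squared modulus of the sum of two equal-size wave contributions.

    Each slit contributes 1/2; the second contribution is shifted by
    `phase_difference` (in radians). The result lies between 0 and 1.
    """
    from_slit_1 = 0.5
    from_slit_2 = 0.5 * cmath.exp(1j * phase_difference)
    return abs(from_slit_1 + from_slit_2) ** 2

# What bouncing balls would give: intensities add, whatever the phase.
without_interference = abs(0.5) ** 2 + abs(0.5) ** 2

bright = relative_intensity(0.0)      # waves in step: constructive interference
dark = relative_intensity(math.pi)    # waves out of step: destructive interference
print(bright, dark, without_interference)
```

In this sketch the bright and dark fringes come only from the relative phase of the two contributions, which is the feature that a model of particles as bouncing balls cannot reproduce.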

In 1923, de Broglie generalised the wave-particle duality of light to electrons. With this hypothesis he recovered Bohr's constraints on the planetary model of the atom, which were supported by the experimental spectrum of hydrogen. De Broglie's hypothesis was confirmed in 1927 by interference patterns of electrons (Davisson & Germer). In 1925 and 1926, Heisenberg discovered a matrix equation, which Pauli applied to a model of the atom, and Schrödinger discovered a wave equation for the same model; all of them give a new theoretical explanation of the hydrogen spectrum. Dirac then proved that Heisenberg's and Schrödinger's formalisms were equivalent - or almost equivalent - and from them he gave general principles for quantum mechanics (Dirac 1930). On these foundations, with the superposition principle in the front line, all of quantum physics could then be developed.

The polarization of light

Experiments on light polarization can be cheap and simple, and they give a direct illustration of the main quantum principle.

The introduction of a filter enhances transparency. This could be called an antifilter.

This experiment can be interpreted with a simplified model. The state space of a photon is a two-dimensional complex vector space. $|\leftrightarrow \rangle$ and $|\updownarrow \rangle$ are two basis vectors of this space. These two states of polarization have a definite experimental meaning. Ordinary light is not polarized, that is, its photons can be in different states of polarization. But light transmitted by a polarizer - like polarizing sunglasses - is always polarized, that is, all its photons are in the same state of polarization. If the polarizer is oriented in some determined direction, then the transmitted photons are all in the state $|\leftrightarrow \rangle$. If it is rotated by 90 degrees, then the transmitted photons are all in the state $|\updownarrow \rangle$. If it is rotated by 45 degrees, then the transmitted photons are all in the state ${\frac {1}{\sqrt {2}}}(|\leftrightarrow \rangle +|\updownarrow \rangle )$.

$|\leftrightarrow \rangle$ and $|\updownarrow \rangle$ are orthogonal states. The meaning of orthogonal here is not only geometrical but also quantum mechanical, that is, if a photon is in the state $|\leftrightarrow \rangle$, it cannot be detected in the state $|\updownarrow \rangle$, and conversely. This means here that when two perpendicular polarizers are combined, light cannot be transmitted, for all light transmitted by the first one is blocked by the second. This can be seen in the picture, where the black part shows the combination of two perpendicular polarizers.

$|\leftrightarrow \rangle$ and ${\frac {1}{\sqrt {2}}}(|\leftrightarrow \rangle +|\updownarrow \rangle )$ are not orthogonal, and neither are $|\updownarrow \rangle$ and ${\frac {1}{\sqrt {2}}}(|\leftrightarrow \rangle +|\updownarrow \rangle )$. If a photon is in the state $|\leftrightarrow \rangle$, it has a probability 1/2 of being detected in the state ${\frac {1}{\sqrt {2}}}(|\leftrightarrow \rangle +|\updownarrow \rangle )$, and conversely. The consequence of this is the existence of an antifilter. If a polarizer is introduced between two perpendicular polarizers, at an angle of 45 degrees, then all photons transmitted by the first polarizer are in the state $|\leftrightarrow \rangle$, half of them - in the ideal case of a perfect polarizer - are transmitted by the second, and are then in the state ${\frac {1}{\sqrt {2}}}(|\leftrightarrow \rangle +|\updownarrow \rangle )$. Half of these latter photons are then transmitted by the third polarizer, that is, a quarter of the original $|\leftrightarrow \rangle$ photons, whereas no photons would have been transmitted without the intermediate "antifilter". This antifilter effect can be clearly seen in the picture.
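The counting of photons in the antifilter experiment can be checked with a few lines of Python. This is a minimal sketch under stated assumptions: the polarizers are ideal, angles are measured from the $|\leftrightarrow \rangle$ direction, and the helper names (`polarization_state`, `transmission_probability`, `fraction_transmitted`) are hypothetical, introduced only for this illustration.

```python
import math

def polarization_state(angle_deg):
    """Linear polarization at `angle_deg`, as components on the
    (horizontal, vertical) basis. 0 is horizontal, 90 is vertical."""
    theta = math.radians(angle_deg)
    return (math.cos(theta), math.sin(theta))

def transmission_probability(state, polarizer_angle_deg):
    """Probability that a photon in `state` passes an ideal polarizer:
    the squared modulus of the scalar product with the polarizer's state."""
    axis = polarization_state(polarizer_angle_deg)
    amplitude = axis[0] * state[0] + axis[1] * state[1]
    return abs(amplitude) ** 2

def fraction_transmitted(angles_deg):
    """Fraction of the photons leaving the first polarizer that also pass
    all the following ones. After each polarizer, the surviving photons
    are in that polarizer's state."""
    fraction = 1.0
    for previous, current in zip(angles_deg, angles_deg[1:]):
        state = polarization_state(previous)
        fraction *= transmission_probability(state, current)
    return fraction

crossed = fraction_transmitted([0, 90])              # two perpendicular polarizers
with_antifilter = fraction_transmitted([0, 45, 90])  # a 45-degree polarizer in between
print(crossed, with_antifilter)
```

Up to rounding errors, the first fraction is zero and the second is one quarter, in agreement with the reasoning above.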

What is a complex number?

To construct complex numbers we consider the rotations around a point $O$ in a plane.

We call $i$ the rotation of a quarter of a turn counterclockwise, the sense that mathematicians conventionally call direct.

If $r_{1}$ and $r_{2}$ are two rotations, we write $r_{2}r_{1}$ for the rotation obtained by first doing $r_{1}$ and then $r_{2}$. This is the usual convention because $g(f(x))=(g\circ f)(x)=(gf)(x)$ is the image of $x$ by $f$ followed by $g$. We write $r^{2}=rr$.

We write $1$ for the absence of displacement. Thus $r1=1r=r$ for any rotation $r$. The rotation of a half-turn is written $-1$ because two half-turns make one turn and $(-1)(-1)=1$. So we have:

$i^{2}=-1$

It only means that two quarters of a turn make a half-turn.

Since the squares of ordinary numbers are never negative, $i$ cannot be identified with any of them. But this does not prevent calculation with $i$ as if it were an ordinary number. Since ordinary numbers are called real numbers, it is said of $i$ that it is an imaginary number.

To complete the construction we add to the rotations the enlargements and the shrinkages, the zooms, which are called homotheties. The homothety of center $O$ and scale factor $\rho$ is the zoom centered on $O$ with scale factor $\rho$. It is an enlargement if $\rho >1$, a shrinkage if $0\leq \rho <1$. The homothety with scale factor $1$ is an absence of displacement; it is therefore identified with the number $1$.

The rotations and the homotheties of the same center commute. This means that the result of a succession of operations does not depend on their order. $z_{1}z_{2}=z_{2}z_{1}$ for all $z_{1}$ and $z_{2}$ . By combining several rotations and several homotheties one always obtains the same result as with a single rotation and a single homothety. The set of these transformations of the plane is called the set of complex numbers. The composition of the transformations defines the product of these numbers.
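This construction can be tried out with Python's built-in complex numbers, where `1j` plays the role of $i$ and multiplication plays the role of composition. The sketch below is only an illustration; the particular angles and scale factors are arbitrary choices, and the variable names are hypothetical.

```python
import cmath
import math

i = 1j                                         # quarter turn counterclockwise
half_turn = i * i                              # two quarter turns make a half-turn

rotation = cmath.rect(1.0, math.radians(30))   # modulus 1: pure rotation by 30 degrees
zoom = complex(2.0, 0.0)                       # argument 0: pure homothety, scale factor 2

# Acting on a point of the plane is multiplication by the complex number.
A = complex(1.0, 0.0)
rotated_then_zoomed = zoom * (rotation * A)
zoomed_then_rotated = rotation * (zoom * A)    # same result: the operations commute

# Composing two transformations multiplies the scale factors and adds the angles.
z1 = cmath.rect(2.0, math.radians(30))
z2 = cmath.rect(3.0, math.radians(60))
product = z2 * z1                              # scale factor 6, angle 90 degrees

print(half_turn, rotated_then_zoomed, zoomed_then_rotated, product)
```

Here `cmath.rect(rho, phi)` builds the complex number with scale factor `rho` and rotation angle `phi` (in radians).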

A complex number is determined by just two real numbers. One is the argument: the rotation angle, between $0$ and $2\pi$ radians (one full turn, or 360 degrees). The other is the modulus: the scale factor of the homothety, which is never negative.

For an orthonormal coordinate system centered on $O$ , let $A$ be the point whose coordinates are $(1,0)$ . Each complex number can be associated with the image of $A$ by the transformation of the plane it defines. In this way each complex number is associated with a single point of the plane, and each point of the plane is associated with a single complex number. It is said that there is a bijection between the points of the plane and the complex numbers. Each complex number can therefore be identified by the coordinates of the point of the plane to which it is associated. For example, $1$ is associated with the point $A$ and hence has $(1,0)$ for coordinates, the coordinates of $i$ are $(0,1)$ and those of the complex number $0$ are $(0,0)$ .

The first coordinate of a complex number is called its real part, the second, its imaginary part. We call purely imaginary a complex number whose real part is zero. $i$ is purely imaginary.

The set of complex numbers can be equipped with an addition operation, defined by adding the coordinates. We then have $z=a+ib$ for a complex number whose real part is $a$ and whose imaginary part is $b$.
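The two descriptions of a complex number, by coordinates and by modulus and argument, can be compared in a short sketch. The numbers chosen are arbitrary. Note one difference of convention, stated here as a fact about Python rather than about the text: `cmath.polar` returns the argument between $-\pi$ and $\pi$, so it is shifted below to the range from $0$ to $2\pi$.

```python
import cmath
import math

z = complex(1.0, 1.0)       # the point (1, 1), that is z = 1 + i
w = complex(3.0, -1.0)      # the point (3, -1), that is w = 3 - i

total = z + w               # addition adds the coordinates
coordinates = (total.real, total.imag)

modulus, argument = cmath.polar(z)            # argument lies in (-pi, pi]
argument_0_to_2pi = argument % (2 * math.pi)  # shifted to the range [0, 2*pi)

print(coordinates, modulus, math.degrees(argument_0_to_2pi))
```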

We say of the set of complex numbers thus constructed, equipped with the operations of addition and multiplication, that it is a field. This only means that one can calculate with complex numbers as with ordinary numbers.

From the definition of the complex exponential, one can prove that the complex number of modulus $\rho$ and argument $\phi$ is equal to $\rho e^{i\phi }$ , hence we have:

$e^{i\phi }=\cos \phi +i\sin \phi$

In particular:

$e^{i\pi }=-1$

This Euler formula is regarded as one of the most beautiful in mathematics because it simply connects four of the most important numbers.
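Both formulas can be verified numerically, up to floating-point rounding. In this sketch the helper `from_polar` and the sample values of $\rho$ and $\phi$ are hypothetical choices made for illustration.

```python
import cmath
import math

def from_polar(rho, phi):
    """Complex number of modulus rho and argument phi, written as rho * e^(i*phi)."""
    return rho * cmath.exp(1j * phi)

rho, phi = 2.0, math.radians(60)
exponential_form = from_polar(rho, phi)
trigonometric_form = rho * (math.cos(phi) + 1j * math.sin(phi))

euler = cmath.exp(1j * math.pi)   # equals -1 up to a tiny rounding error

print(exponential_form, trigonometric_form, euler)
```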

Why is quantum reality represented by complex numbers?

Classical physics declares several principles of superposition: superposition of waves, of forces, of probability distributions, and so on. But none of them makes use of complex numbers. The role of complex numbers in classical physics is limited. They are especially useful for studying sinusoidal functions, but they do not play a fundamental role. According to classical physics, the quantities which describe reality are always real numbers.

The beginner is tempted to interpret the quantum superpositions as distributions of probabilities. But this is an impasse, because the probability distributions are defined with real numbers.

Complex numbers make quantum states more than ordinary distributions of probabilities. They are essential to the quantum way of being. Why is this so? Nobody knows.

Scalar product and unitary operators

For plane geometry, the scalar product of two vectors $u=(u_{1},u_{2})$ and $v=(v_{1},v_{2})$ is $\langle u,v\rangle =u_{1}v_{1}+u_{2}v_{2}$ , which is generalized to $\langle u,v\rangle =\sum _{i}u_{i}v_{i}$ in spaces of higher dimension. The scalar product of a vector $v$ by itself is the square of its length $|v|$ : ${v_{1}}^{2}+{v_{2}}^{2}=|v|^{2}$ . It is simply the theorem of Pythagoras.

When vectors are defined with complex numbers, their scalar product is defined by:

$\langle u|v\rangle =\sum _{i}{u_{i}}^{*}v_{i}$

where $z^{*}$ is the complex conjugate of $z$ . It is defined by $z^{*}=a-ib$ for $z=a+ib$ . The conjugation operation in the plane of the complex numbers is the reflection with respect to the horizontal axis.
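The complex scalar product can be written directly from its definition. The function name `scalar_product` and the two sample vectors are hypothetical, chosen only to illustrate the formula.

```python
def scalar_product(u, v):
    """<u|v> = sum over i of conj(u_i) * v_i, for vectors given as lists."""
    return sum(ui.conjugate() * vi for ui, vi in zip(u, v))

u = [1 + 1j, 2j]
v = [3 + 0j, 1 - 1j]

uv = scalar_product(u, v)
vu = scalar_product(v, u)        # the complex conjugate of <u|v>
uu = scalar_product(u, u)        # squared length of u: a real, non-negative number

print(uv, vu, uu)
```

The conjugation is what makes the scalar product of a vector with itself a real number that is never negative, so that it can still be read as a squared length.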

A transformation $T$ of the plane or of the space conserves the lengths when it conserves the scalar product:

$\langle T(u),T(v)\rangle =\langle u,v\rangle$

It is then called an isometry. It is a rotation, a reflection, or a combination of them.

If $T$ is a transformation of a complex vector space and if it conserves the scalar product, it is called a unitary operator. The quantum superpositions do not define probability distributions, which are real numbers, but probability amplitudes, which are complex numbers. A probability is calculated by taking the squared modulus of a probability amplitude. These probabilities are attributed to all possible outcomes of an experiment. It is therefore necessary that their sum should be equal to one. This is why quantum states are always identified with vectors of length one when calculating probabilities.
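The passage from probability amplitudes to probabilities can be sketched as follows. The state $3|1\rangle +4i|2\rangle$ is an arbitrary example, and the helper names `normalize` and `probabilities` are hypothetical.

```python
import math

def normalize(amplitudes):
    """Rescale a list of complex amplitudes to a vector of length one."""
    length = math.sqrt(sum(abs(a) ** 2 for a in amplitudes))
    return [a / length for a in amplitudes]

def probabilities(amplitudes):
    """Squared moduli of the amplitudes of the normalized state."""
    return [abs(a) ** 2 for a in normalize(amplitudes)]

state = [3.0 + 0j, 4.0j]          # 3|1> + 4i|2>, not yet of length one
probs = probabilities(state)       # approximately [0.36, 0.64]

print(probs, sum(probs))
```

The amplitudes are complex, but the probabilities obtained from them are real numbers between 0 and 1 whose sum is one.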

The evolution operators which describe the state changes between two successive instants must not change the length of the state vectors so that their components can be interpreted as probability amplitudes. The principle of unitary evolution (cf. 2.1, second principle) imposes unitary evolution operators and thus guarantees the possibility of a probabilistic interpretation.

A unitary operator $U$ is linear:

$U(\alpha u+\beta v)=\alpha U(u)+\beta U(v)$

This is one of the most important formulas of quantum physics (cf. 2.3).
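Linearity and conservation of length can be checked on a small example. The matrix below is a standard two-dimensional unitary matrix chosen here for illustration; it is an assumption of this sketch and does not come from the text, and the helper names are hypothetical.

```python
import math

s = 1 / math.sqrt(2)
U = [[s, s],
     [s, -s]]      # an example of a unitary matrix (assumed for this sketch)

def apply(matrix, vector):
    """Matrix-vector product: the action of the operator on a state vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

def combine(alpha, u, beta, v):
    """The superposition alpha*u + beta*v."""
    return [alpha * ui + beta * vi for ui, vi in zip(u, v)]

def length(vector):
    return math.sqrt(sum(abs(x) ** 2 for x in vector))

u = [1 + 0j, 0j]
v = [0j, 1j]
alpha, beta = 0.6, 0.8j

left = apply(U, combine(alpha, u, beta, v))              # U(alpha u + beta v)
right = combine(alpha, apply(U, u), beta, apply(U, v))   # alpha U(u) + beta U(v)

print(left, right, length(combine(alpha, u, beta, v)), length(left))
```

The two sides agree, and the state vector has the same length before and after the transformation, so its components can still be interpreted as probability amplitudes.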

Physicists are accustomed to writing $|v\rangle$ for the state vector $v$. This is Dirac's notation. Together with the dual notation $\langle v|$, it makes linear algebra very convenient, and makes it possible to calculate correctly even if one understands nothing. It is sometimes misleading, which is why it is rejected by certain physicists (Peres 1995, Weinberg 2012). It is used throughout this book.

Tensor product and entanglement

Let A and B be two physical systems. The $|a_{i}\rangle$ form a basis of the space $H_{A}$ of the states of A, and the $|b_{j}\rangle$ a basis of the space $H_{B}$ of the states of B.

The pairs $(|a_{i}\rangle ,|b_{j}\rangle )$, denoted $|a_{i}\rangle \otimes |b_{j}\rangle$, can be considered as vectors of a new vector space, denoted by $H_{A}\otimes H_{B}$, the tensor product of $H_{A}$ and $H_{B}$. It can be constructed simply by taking all the $|a_{i}\rangle \otimes |b_{j}\rangle$ as basis states. It does not depend on the choice of the bases $|a_{i}\rangle$ and $|b_{j}\rangle$.

If $|u\rangle =\sum _{i}u_{i}|a_{i}\rangle$ and $|v\rangle =\sum _{j}v_{j}|b_{j}\rangle$ are vectors in $H_{A}$ and $H_{B}$, their tensor product is defined by:

$|u\rangle \otimes |v\rangle =(\sum _{i}u_{i}|a_{i}\rangle )\otimes (\sum _{j}v_{j}|b_{j}\rangle )=\sum _{ij}u_{i}v_{j}(|a_{i}\rangle \otimes |b_{j}\rangle )$

$|u\rangle \otimes |v\rangle$ is a separable vector. It assigns a single vector $|u\rangle$ to A and a single vector $|v\rangle$ to B. The vectors in $H_{A}\otimes H_{B}$ are not always separable, because in general the sum of two separable vectors is not a separable vector. The inseparable states, which are also called entangled, are of fundamental importance in quantum physics (see chapter 4).
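The definition of the tensor product, and the fact that a sum of separable vectors need not be separable, can be illustrated in the smallest case, where $H_{A}$ and $H_{B}$ both have dimension two. The separability test used below is valid only in that case: a vector with components $c_{00},c_{01},c_{10},c_{11}$ is separable exactly when $c_{00}c_{11}-c_{01}c_{10}=0$. The function names are hypothetical.

```python
def tensor(u, v):
    """Components u_i * v_j of |u> (x) |v>, ordered by (i, j)."""
    return [ui * vj for ui in u for vj in v]

def add(x, y):
    return [xi + yi for xi, yi in zip(x, y)]

def is_separable_2x2(state, tolerance=1e-12):
    """Separability test for two 2-dimensional systems only:
    c00*c11 - c01*c10 must vanish."""
    c00, c01, c10, c11 = state
    return abs(c00 * c11 - c01 * c10) < tolerance

a0, a1 = [1, 0], [0, 1]      # basis of H_A
b0, b1 = [1, 0], [0, 1]      # basis of H_B

u = [0.6, 0.8j]
v = [1, 1j]
product_state = tensor(u, v)                              # separable by construction
sum_of_products = add(tensor(a0, b0), tensor(a1, b1))     # |a0>|b0> + |a1>|b1>, not normalized

print(is_separable_2x2(product_state), is_separable_2x2(sum_of_products))
```

Each of the two terms of `sum_of_products` is separable, but their sum is not: it is an entangled vector.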

The qubits

The simplest quantum state space is the state space of a qubit. It is a space of dimension two. If its dimension were one, a quantum being could not evolve: there would be no movement, and therefore no physics.

All state spaces of all quantum systems can be constructed from finite-dimensional spaces, passing to the limit if we want them to be of infinite dimension, and especially from the simplest of them, the state space of a qubit.
