# This Quantum World/print version

version 2019–12–6 of
This Quantum World

The current, editable version of this book is available in Wikibooks, the open-content textbooks collection, at
//en.wikibooks.org/wiki/This_Quantum_World

# Atoms

## What does an atom look like?


None of these images depicts an atom as it is. This is because it is impossible to even visualize an atom as it is. Whereas the best you can do with the images in the first row is to erase them from your memory—they represent a way of viewing the atom that is too simplified for the way we want to start thinking about it—the eight fuzzy images in the next row deserve scrutiny. Each represents an aspect of a stationary state of atomic hydrogen. You see neither the nucleus (a proton) nor the electron. What you see is a fuzzy position. To be precise, what you see are cloud-like blurs, which are symmetrical about the vertical and horizontal axes, and which represent the atom's internal relative position—the position of the electron relative to the proton or the position of the proton relative to the electron.

• What is the state of an atom?
• What is a stationary state?
• What exactly is a fuzzy position?
• How does such a blur represent the atom's internal relative position?
• Why can we not describe the atom's internal relative position as it is?

## Quantum states

In quantum mechanics, states are probability algorithms. We use them to calculate the probabilities of the possible outcomes of measurements on the basis of actual measurement outcomes. A quantum state takes as its input

• one or several measurement outcomes,
• a measurement M,
• the time of M,

and it yields as its output the probabilities of the possible outcomes of M.

A quantum state is called stationary if the probabilities it assigns are independent of the time of the measurement.

From the mathematical point of view, each blur represents a density function ${\displaystyle \rho ({\boldsymbol {r}})}$. Imagine a small region ${\displaystyle R}$ like the little box inside the first blur. And suppose that this is a region of the (mathematical) space of positions relative to the proton. If you integrate ${\displaystyle \rho ({\boldsymbol {r}})}$ over ${\displaystyle R,}$ you obtain the probability ${\displaystyle p\,(R)}$ of finding the electron in ${\displaystyle R,}$ provided that the appropriate measurement is made:

${\displaystyle p\,(R)=\int _{R}\rho ({\boldsymbol {r}})\,d^{3}{\boldsymbol {r}}.}$

"Appropriate" here means capable of ascertaining the truth value of the proposition "the electron is in ${\displaystyle R}$", the possible truth values being "true" or "false". What we see in each of the following images is a surface of constant probability density.
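The integral ${\displaystyle p\,(R)=\int _{R}\rho ({\boldsymbol {r}})\,d^{3}{\boldsymbol {r}}}$ can be checked numerically. The following sketch assumes the ground state of hydrogen, whose density in atomic units (Bohr radius ${\displaystyle a_{0}=1}$) is ${\displaystyle \rho (r)=e^{-2r}/\pi }$; this specific density is imported from the mature theory, not derived in the text.

```python
import numpy as np

A0 = 1.0  # Bohr radius in atomic units (assumption: hydrogen ground state)

def rho(r):
    """Ground-state probability density exp(-2r/a0)/(pi*a0^3)."""
    return np.exp(-2.0 * r / A0) / (np.pi * A0**3)

def p_inside(R, n=200000):
    """p(R): probability of finding the electron within radius R of the proton."""
    r = np.linspace(0.0, R, n)
    integrand = rho(r) * 4.0 * np.pi * r**2   # integrate over spherical shells
    dr = r[1] - r[0]
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1])) * dr)

print(round(p_inside(1.0), 3))   # probability of r < a0, about 0.323
print(round(p_inside(50.0), 3))  # essentially all of space, about 1.0
```

Integrating over all of space yields 1, as a probability density must.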

Now imagine that the appropriate measurement is made. Before the measurement, the electron is neither inside ${\displaystyle R}$ nor outside ${\displaystyle R}$. If it were inside, the probability of finding it outside would be zero, and if it were outside, the probability of finding it inside would be zero. After the measurement, on the other hand, the electron is either inside or outside ${\displaystyle R.}$

Conclusions:

• Before the measurement, the proposition "the electron is in ${\displaystyle R}$" is neither true nor false; it lacks a (definite) truth value.
• A measurement generally changes the state of the system on which it is performed.

As mentioned before, probabilities are assigned not only to measurement outcomes but also on the basis of measurement outcomes. Each density function ${\displaystyle \rho _{nlm}}$ serves to assign probabilities to the possible outcomes of a measurement of the electron's position relative to the proton. And in each case the assignment is based on the outcomes of a simultaneous measurement of three observables: the atom's energy (specified by the value of the principal quantum number ${\displaystyle n}$), its total angular momentum ${\displaystyle l}$ (specified by a letter, here p, d, or f), and the vertical component of its angular momentum ${\displaystyle m}$.

## Fuzzy observables

We say that an observable ${\displaystyle Q}$ with a finite or countable number of possible values ${\displaystyle q_{k}}$ is fuzzy (or that it has a fuzzy value) if and only if at least one of the propositions "The value of ${\displaystyle Q}$ is ${\displaystyle q_{k}}$" lacks a truth value. This is equivalent to the following necessary and sufficient condition: the probability assigned to at least one of the values ${\displaystyle q_{k}}$ is neither 0 nor 1.

What about observables that are generally described as continuous, like a position?

The description of an observable as "continuous" is potentially misleading. For one thing, we cannot separate an observable and its possible values from a measurement and its possible outcomes, and a measurement with an uncountable set of possible outcomes is not even in principle possible. For another, there is not a single observable called "position". Different partitions of space define different position measurements with different sets of possible outcomes.

• Corollary: The possible outcomes of a position measurement (or the possible values of a position observable) are defined by a partition of space. They make up a finite or countable set of regions of space. An exact position is therefore neither a possible measurement outcome nor a possible value of a position observable.

So how do those cloud-like blurs represent the electron's fuzzy position relative to the proton? Strictly speaking, they graphically represent probability densities in the mathematical space of exact relative positions, rather than fuzzy positions. It is these probability densities that represent fuzzy positions by allowing us to calculate the probability of every possible value of every position observable.

It should now be clear why we cannot describe the atom's internal relative position as it is. To describe a fuzzy observable is to assign probabilities to the possible outcomes of a measurement. But a description that rests on the assumption that a measurement is made, does not describe an observable as it is (by itself, regardless of measurements).

## Planck

Quantum mechanics began as a desperate measure to get around some spectacular failures of what subsequently came to be known as classical physics.

In 1900 Max Planck discovered a law that perfectly describes the spectrum of a glowing hot object. Planck's radiation formula turned out to be irreconcilable with the physics of his time. (If classical physics were right, you would be blinded by ultraviolet light if you looked at the burner of a stove, a failure known as the ultraviolet catastrophe.) At first, it was just a fit to the data, "a fortuitous guess at an interpolation formula" as Planck himself called it. Only weeks later did it turn out to imply the quantization of energy for the emission of electromagnetic radiation: the energy ${\displaystyle ~E~}$ of a quantum of radiation is proportional to the frequency ${\displaystyle \nu }$ of the radiation, the constant of proportionality being Planck's constant ${\displaystyle ~h~}$

${\displaystyle ~E=h\nu ~}$.

We can of course use the angular frequency ${\displaystyle ~\omega =2\pi \nu ~}$ instead of ${\displaystyle \nu }$. Introducing the reduced Planck constant ${\displaystyle ~\hbar =h/2\pi ~}$, we then have

${\displaystyle E=\hbar \omega }$.

Planck's law holds at all temperatures and accurately describes the spectrum of blackbody radiation.
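As a quick numerical illustration of ${\displaystyle E=h\nu }$, here is a sketch computing the energy of a single quantum of green light; the frequency value is an illustrative assumption.

```python
# Sketch: energy of one quantum of visible light via E = h*nu.
H = 6.62607015e-34      # Planck's constant in J*s (exact SI value)
EV = 1.602176634e-19    # one electronvolt in J (exact SI value)

nu = 5.45e14            # frequency of green light in Hz (illustrative value)
E = H * nu
print(E)        # about 3.6e-19 J
print(E / EV)   # about 2.25 eV
```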

## Rutherford

In 1911 Ernest Rutherford proposed a model of the atom based on experiments by Geiger and Marsden. Geiger and Marsden had directed a beam of alpha particles at a thin gold foil. Most of the particles passed through the foil more or less as expected, but about one in 8000 bounced back as if it had encountered a much heavier object. In Rutherford's own words, this was as incredible as if you fired a 15-inch shell at a piece of tissue paper and it came back and hit you. After analysing the data collected by Geiger and Marsden, Rutherford concluded that the diameter of the atomic nucleus (which contains over 99.9% of the atom's mass) was less than 0.01% of the diameter of the entire atom. He suggested that the atom is spherical and that the atomic electrons orbit the nucleus much like planets orbit a star, and he calculated the mass of the electron to be about 1/7000 of the mass of an alpha particle. Rutherford's atomic model is also called the nuclear model.

The problem with having electrons orbit the nucleus the same way that a planet orbits a star is that classical electromagnetic theory demands that an orbiting electron radiate away its energy and spiral into the nucleus in about ${\displaystyle 0.5\times 10^{-10}}$ seconds. This was the worst quantitative failure in the history of physics, under-predicting the lifetime of hydrogen by at least forty orders of magnitude! (This figure is based on the experimentally established lower bound on the proton's lifetime.)

## Bohr

In 1913 Niels Bohr postulated that the angular momentum ${\displaystyle L}$ of an orbiting atomic electron was quantized: its "allowed" values are integral multiples of ${\displaystyle \hbar }$:

${\displaystyle L=n\hbar }$ where ${\displaystyle n=1,2,3,\dots }$

Why quantize angular momentum, rather than any other quantity?

• Radiation energy of a given frequency ${\displaystyle \nu }$ is quantized in multiples of ${\displaystyle h\nu .}$
• Planck's constant is measured in the same units as angular momentum.

Bohr's postulate explained not only the stability of atoms but also why the emission and absorption of electromagnetic radiation by atoms is discrete. In addition it enabled him to calculate with remarkable accuracy the spectrum of atomic hydrogen — the frequencies at which it is able to emit and absorb light (visible as well as infrared and ultraviolet). The following image shows the visible emission spectrum of atomic hydrogen, which contains four lines of the Balmer series.


Apart from his quantization postulate, Bohr's reasoning at this point remained completely classical. Let's assume with Bohr that the electron's orbit is a circle of radius ${\displaystyle r.}$ The speed of the electron is then given by ${\displaystyle v=r\,d\beta /dt,}$ and the magnitude of its acceleration by ${\displaystyle a=dv/dt=v\,d\beta /dt.}$ Eliminating ${\displaystyle d\beta /dt}$ yields ${\displaystyle a=v^{2}/r.}$ In the cgs system of units, the magnitude of the Coulomb force is simply ${\displaystyle F=e^{2}/r^{2},}$ where ${\displaystyle e}$ is the magnitude of the charge of both the electron and the proton. Via Newton's ${\displaystyle F=ma}$ the last two equations yield ${\displaystyle m_{e}v^{2}=e^{2}/r,}$ where ${\displaystyle m_{e}}$ is the electron's mass. If we take the proton to be at rest, we obtain ${\displaystyle T=m_{e}v^{2}/2=e^{2}/2r}$ for the electron's kinetic energy.

If the electron's potential energy at infinity is set to 0, then its potential energy ${\displaystyle V}$ at a distance ${\displaystyle r}$ from the proton is minus the work required to move it from ${\displaystyle r}$ to infinity,

${\displaystyle V=-\int _{r}^{\infty }F(r')\,dr'=-\int _{r}^{\infty }\!{e^{2} \over (r')^{2}}\,dr'=+\left[{e^{2} \over r'}\right]_{r}^{\infty }=0-{e^{2} \over r}.}$

The total energy of the electron thus is

${\displaystyle E=T+V=e^{2}/2r-e^{2}/r=-e^{2}/2r.}$

We want to express this in terms of the electron's angular momentum ${\displaystyle L=m_{e}vr.}$ Remembering that ${\displaystyle m_{e}v^{2}=e^{2}/r,}$ and hence ${\displaystyle rm_{e}^{2}v^{2}=m_{e}e^{2},}$ and multiplying the numerator ${\displaystyle e^{2}\,}$ by ${\displaystyle m_{e}e^{2}}$ and the denominator ${\displaystyle 2r}$ by ${\displaystyle rm_{e}^{2}v^{2},}$ we obtain

${\displaystyle E=-{e^{2} \over 2r}=-{m_{e}e^{4} \over 2m_{e}^{2}v^{2}r^{2}}=-{m_{e}e^{4} \over 2L^{2}}.}$

Now comes Bohr's break with classical physics: he simply replaced ${\displaystyle L}$ by ${\displaystyle n\hbar }$. The "allowed" values for the angular momentum define a series of allowed values for the atom's energy:

${\displaystyle E_{n}=-{1 \over n^{2}}\left({m_{e}e^{4} \over 2\hbar ^{2}}\right),\quad n=1,2,3,\dots }$

As a result, the atom can emit or absorb energy only by amounts equal to the absolute values of the differences

${\displaystyle \Delta E_{nm}=E_{n}-E_{m}=\left({1 \over m^{2}}-{1 \over n^{2}}\right)\,{\hbox{Ry}},}$

one Rydberg (Ry) being equal to ${\displaystyle m_{e}e^{4}/2\hbar ^{2}=13.6056923(12)\,{\hbox{eV.}}}$ This is also the ionization energy ${\displaystyle \Delta E_{1\infty }}$ of atomic hydrogen — the energy needed to completely remove the electron from the proton. Bohr's predicted value was found to be in excellent agreement with the measured value.
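The energy formula above can be turned into the wavelengths of the Balmer series (${\displaystyle m=2}$) mentioned earlier. A minimal sketch, which should reproduce the four visible Balmer lines (about 656, 486, 434, and 410 nm); the value of ${\displaystyle hc}$ in eV·nm is an assumption not stated in the text.

```python
RY_EV = 13.6056923     # one Rydberg in eV (value from the text)
HC_EV_NM = 1239.84     # h*c in eV*nm (assumed conversion constant)

def balmer_wavelength_nm(n):
    """Wavelength of the n -> 2 transition of atomic hydrogen."""
    delta_e = RY_EV * (1.0 / 2**2 - 1.0 / n**2)   # |E_n - E_2| in eV
    return HC_EV_NM / delta_e

for n in (3, 4, 5, 6):
    print(n, round(balmer_wavelength_nm(n), 1))
```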

Using two of the above expressions for the atom's energy and solving for ${\displaystyle r,}$ we obtain ${\displaystyle r=n^{2}\hbar ^{2}/m_{e}e^{2}.}$ For the ground state ${\displaystyle (n=1)}$ this is the Bohr radius of the hydrogen atom, which equals ${\displaystyle \hbar ^{2}/m_{e}e^{2}=5.291772108(18)\times 10^{-11}m.}$ The mature theory yields the same figure but interprets it as the most likely distance from the proton at which the electron would be found if its distance from the proton were measured.

## de Broglie

In 1923, ten years after Bohr had derived the spectrum of atomic hydrogen by postulating the quantization of angular momentum, Louis de Broglie hit on an explanation of why the atom's angular momentum comes in multiples of ${\displaystyle \hbar .}$ Since 1905, Einstein had argued that electromagnetic radiation itself was quantized (and not merely its emission and absorption, as Planck held). If electromagnetic waves can behave like particles (now known as photons), de Broglie reasoned, why can't electrons behave like waves?

Suppose that the electron in a hydrogen atom is a standing wave on what has so far been thought of as the electron's circular orbit. (The crests, troughs, and nodes of a standing wave are stationary.) For such a wave to exist on a circle, the circumference of the latter must be an integral multiple of the wavelength ${\displaystyle \lambda }$ of the former: ${\displaystyle 2\pi r=n\lambda .}$

Einstein had established not only that electromagnetic radiation of frequency ${\displaystyle \nu }$ comes in quanta of energy ${\displaystyle E=h\nu }$ but also that these quanta carry a momentum ${\displaystyle p=h/\lambda .}$ Using this formula to eliminate ${\displaystyle \lambda }$ from the condition ${\displaystyle 2\pi r=n\lambda ,}$ one obtains ${\displaystyle pr=n\hbar .}$ But ${\displaystyle pr=mvr}$ is just the angular momentum ${\displaystyle L}$ of a classical electron with an orbit of radius ${\displaystyle r.}$ In this way de Broglie derived the condition ${\displaystyle L=n\hbar }$ that Bohr had simply postulated.
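De Broglie's standing-wave condition can be checked numerically for the ground state: with ${\displaystyle n=1,}$ exactly one wavelength ${\displaystyle \lambda =h/p}$ should fit on the circumference ${\displaystyle 2\pi r.}$ A sketch, where the ground-state orbital speed is an assumed value (the fine-structure constant times ${\displaystyle c}$), not a number from the text:

```python
import math

H = 6.62607015e-34    # Planck's constant, J*s
M_E = 9.1093837e-31   # electron mass, kg
A0 = 5.291772108e-11  # Bohr radius from the text, m
V1 = 2.18769e6        # ground-state orbital speed, m/s (assumed value)

wavelength = H / (M_E * V1)       # de Broglie wavelength h/p
circumference = 2 * math.pi * A0
print(wavelength, circumference)  # both about 3.32e-10 m
```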

## Schrödinger

If the electron is a standing wave, why should it be confined to a circle? After de Broglie's crucial insight that particles are waves of some sort, it took less than three years for the mature quantum theory to be found, not once but twice: by Werner Heisenberg in 1925 and by Erwin Schrödinger in 1926. If we let the electron be a standing wave in three dimensions, we have all it takes to arrive at the Schrödinger equation, which is at the heart of the mature theory.

Let's keep to one spatial dimension. The simplest mathematical description of a wave of angular wavenumber ${\displaystyle k=2\pi /\lambda }$ and angular frequency ${\displaystyle \omega =2\pi /T=2\pi \nu }$ (at any rate, if you are familiar with complex numbers) is the function

${\displaystyle \psi (x,t)=e^{i(kx-\omega t)}.}$

Let's express the phase ${\displaystyle \phi (x,t)=kx-\omega t}$ in terms of the electron's energy ${\displaystyle E=h\nu =\hbar \omega }$ and momentum ${\displaystyle p=h/\lambda =\hbar k:}$

${\displaystyle \psi (x,t)=e^{i(px-Et)/\hbar }.}$

The partial derivatives with respect to ${\displaystyle x}$ and ${\displaystyle t}$ are

${\displaystyle {\partial \psi \over \partial x}={i \over \hbar }p\psi \quad {\hbox{and}}\quad {\partial \psi \over \partial t}=-{i \over \hbar }E\psi .}$

We also need the second partial derivative of ${\displaystyle \psi }$ with respect to ${\displaystyle x}$:

${\displaystyle {\partial ^{2}\psi \over \partial x^{2}}=\left({ip \over \hbar }\right)^{2}\psi .}$

We thus have

${\displaystyle E\psi =i\hbar {\partial \psi \over \partial t},\quad p\psi =-i\hbar {\partial \psi \over \partial x},\quad {\hbox{and}}\quad p^{2}\psi =-\hbar ^{2}{\partial ^{2}\psi \over \partial x^{2}}.}$

In non-relativistic classical physics the kinetic energy and the kinetic momentum ${\displaystyle p}$ of a free particle are related via the dispersion relation

${\displaystyle E=p^{2}/2m.}$

This relation also holds in non-relativistic quantum physics. Later you will learn why.

In three spatial dimensions, ${\displaystyle p}$ is the magnitude of a vector ${\displaystyle {\textbf {p}}}$. If the particle also has a potential energy ${\displaystyle V({\textbf {r}},t)}$ and a potential momentum ${\displaystyle {\textbf {A}}({\textbf {r}},t)}$ (in which case it is not free), and if ${\displaystyle E}$ and ${\displaystyle {\textbf {p}}}$ stand for the particle's total energy and total momentum, respectively, then the dispersion relation is

${\displaystyle E-V=({\textbf {p}}-{\textbf {A}})^{2}/2m.}$

By the square of a vector ${\displaystyle {\textbf {v}}}$ we mean the dot (or scalar) product ${\displaystyle {\textbf {v}}\cdot {\textbf {v}}}$. Later you will learn why we represent possible influences on the motion of a particle by such fields as ${\displaystyle V({\textbf {r}},t)}$ and ${\displaystyle {\textbf {A}}({\textbf {r}},t).}$

Returning to our fictitious world with only one spatial dimension, allowing for a potential energy ${\displaystyle V(x,t)}$, substituting the differential operators ${\displaystyle i\hbar {\partial \over \partial t}}$ and ${\displaystyle -\hbar ^{2}{\partial ^{2} \over \partial x^{2}}}$ for ${\displaystyle E}$ and ${\displaystyle p^{2}}$ in the resulting dispersion relation, and applying both sides of the resulting operator equation to ${\displaystyle \psi ,}$ we arrive at the one-dimensional (time-dependent) Schrödinger equation:

 ${\displaystyle i\hbar {\partial \psi \over \partial t}=-{\hbar ^{2} \over 2m}{\partial ^{2}\psi \over \partial x^{2}}+V\psi }$

In three spatial dimensions and with both potential energy ${\displaystyle V({\textbf {r}},t)}$ and potential momentum ${\displaystyle {\textbf {A}}({\textbf {r}},t)}$ present, we proceed from the relation ${\displaystyle E-V=({\textbf {p}}-{\textbf {A}})^{2}/2m,}$ substituting ${\displaystyle i\hbar {\partial \over \partial t}}$ for ${\displaystyle E}$ and ${\displaystyle -i\hbar {\partial \over \partial {\textbf {r}}}}$ for ${\displaystyle {\textbf {p}}.}$ The differential operator ${\displaystyle {\partial \over \partial {\textbf {r}}}}$ is a vector whose components are the differential operators ${\displaystyle \left({\partial \over \partial x},{\partial \over \partial y},{\partial \over \partial z}\right).}$ The result:

${\displaystyle i\hbar {\partial \psi \over \partial t}={\frac {1}{2m}}\left(-i\hbar {\partial \over \partial {\textbf {r}}}-{\textbf {A}}\right)^{2}\psi +V\psi ,}$

where ${\displaystyle \psi }$ is now a function of ${\displaystyle {\textbf {r}}=(x,y,z)}$ and ${\displaystyle t.}$ This is the three-dimensional Schrödinger equation. In non-relativistic investigations (to which the Schrödinger equation is confined) the potential momentum can generally be ignored, which is why the Schrödinger equation is often given this form:

 ${\displaystyle i\hbar {\partial \psi \over \partial t}=-{\hbar ^{2} \over 2m}\left({\partial ^{2}\psi \over \partial x^{2}}+{\partial ^{2}\psi \over \partial y^{2}}+{\partial ^{2}\psi \over \partial z^{2}}\right)+V\psi }$

The free Schrödinger equation (without even the potential energy term) is satisfied by ${\displaystyle \psi (x,t)=e^{i(kx-\omega t)}}$ (in one dimension) or ${\displaystyle \psi ({\textbf {r}},t)=e^{i(\mathbf {k} \cdot \mathbf {r} -\omega t)}}$ (in three dimensions) provided that ${\displaystyle E=\hbar {\omega }}$ equals ${\displaystyle p^{2}/2m=(\hbar k)^{2}/2m,}$ which is to say: ${\displaystyle \omega (k)=\hbar k^{2}/2m.}$ However, since we are dealing with a homogeneous linear differential equation — which tells us that solutions may be added and/or multiplied by an arbitrary constant to yield additional solutions — any function of the form

${\displaystyle \psi (x,t)={1 \over {\sqrt {2\pi }}}\int {\overline {\psi }}(k)\,e^{i[kx-\omega (k)t]}dk={1 \over {\sqrt {2\pi }}}\int {\overline {\psi }}(k,t)\,e^{ikx}dk}$

with ${\displaystyle {\overline {\psi }}(k,t)={\overline {\psi }}(k)e^{-i\omega (k)t}}$ solves the (one-dimensional) Schrödinger equation. If no integration boundaries are specified, then we integrate over the real line, i.e., the integral is defined as the limit ${\displaystyle \lim _{L\rightarrow \infty }\int _{-L}^{+L}.}$ The converse also holds: every solution is of this form. The factor in front of the integral is present for purely cosmetic reasons, as you will realize presently. ${\displaystyle {\overline {\psi }}(k,t)}$ is the Fourier transform of ${\displaystyle \psi (x,t),}$ which means that

${\displaystyle {\overline {\psi }}(k,t)={1 \over {\sqrt {2\pi }}}\int \psi (x,t)\,e^{-ikx}dx.}$

The Fourier transform of ${\displaystyle \psi (x,t)}$ exists because the integral ${\displaystyle \int |\psi (x,t)|dx}$ is finite. In the next section we will come to know the physical reason why this integral is finite.
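A finite-difference sketch confirming that the plane wave ${\displaystyle e^{i(kx-\omega t)}}$ with ${\displaystyle \omega (k)=\hbar k^{2}/2m}$ solves the free one-dimensional Schrödinger equation; the units ${\displaystyle \hbar =m=1}$ are a convenience assumption.

```python
import numpy as np

hbar = m = 1.0                  # natural units (assumption)
k = 2.0
omega = hbar * k**2 / (2 * m)   # dispersion relation from the text

def psi(x, t):
    """Plane wave exp(i(kx - omega*t))."""
    return np.exp(1j * (k * x - omega * t))

x0, t0, h = 0.7, 0.3, 1e-4
# left side: i*hbar*dpsi/dt, central difference in t
lhs = 1j * hbar * (psi(x0, t0 + h) - psi(x0, t0 - h)) / (2 * h)
# right side: -(hbar^2/2m)*d^2psi/dx^2, central difference in x
rhs = -hbar**2 / (2 * m) * (psi(x0 + h, t0) - 2 * psi(x0, t0) + psi(x0 - h, t0)) / h**2
print(abs(lhs - rhs))  # ~0 up to finite-difference error
```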

So now we have a condition that every electron "wave function" must satisfy in order to satisfy the appropriate dispersion relation. If this (and hence the Schrödinger equation) contains either or both of the potentials ${\displaystyle V}$ and ${\displaystyle {\textbf {A}}}$, then finding solutions can be tough. As a budding quantum mechanician, you will spend a considerable amount of time learning to solve the Schrödinger equation with various potentials.

## Born

In the same year that Erwin Schrödinger published the equation that now bears his name, the nonrelativistic theory was completed by Max Born's insight that the Schrödinger wave function ${\displaystyle \psi (\mathbf {r} ,t)}$ is actually nothing but a tool for calculating probabilities, and that the probability of detecting a particle "described by" ${\displaystyle \psi (\mathbf {r} ,t)}$ in a region of space ${\displaystyle R}$ is given by the volume integral

${\displaystyle \int _{R}|\psi (\mathbf {r} ,t)|^{2}\,d^{3}r=\int _{R}\psi ^{*}\psi \,d^{3}r}$

— provided that the appropriate measurement is made, in this case a test for the particle's presence in ${\displaystyle R}$. Since the probability of finding the particle somewhere (no matter where) has to be 1, only a square integrable function can "describe" a particle. This rules out ${\displaystyle \psi (\mathbf {r} )=e^{i\mathbf {k} \cdot \mathbf {r} },}$ which is not square integrable. In other words, no particle can have a momentum so sharp as to be given by ${\displaystyle \hbar }$ times a wave vector ${\displaystyle \mathbf {k} }$, rather than by a genuine probability distribution over different momenta.

Given a probability density function ${\displaystyle |\psi (x)|^{2}}$, we can define the expected value

${\displaystyle \langle x\rangle =\int |\psi (x)|^{2}\,x\,dx=\int \psi ^{*}\,x\,\psi \,dx}$

and the standard deviation  ${\displaystyle \Delta x={\sqrt {\int |\psi |^{2}(x-\langle x\rangle )^{2}\,dx}}}$

as well as higher moments of ${\displaystyle |\psi (x)|^{2}}$. By the same token,

${\displaystyle \langle k\rangle =\int {\overline {\psi }}\,^{*}\,k\,{\overline {\psi }}\,dk}$  and  ${\displaystyle \Delta k={\sqrt {\int |{\overline {\psi }}|^{2}(k-\langle k\rangle )^{2}\,dk}}.}$

Here is another expression for ${\displaystyle \langle k\rangle :}$

${\displaystyle \langle k\rangle =\int \psi ^{*}(x)\left(-i{\frac {d}{dx}}\right)\psi (x)\,dx.}$

To check that the two expressions are in fact equal, we plug  ${\displaystyle \psi (x)=(2\pi )^{-1/2}\int {\overline {\psi }}(k)\,e^{ikx}dk}$  into the latter expression:

${\displaystyle \langle k\rangle ={\frac {1}{\sqrt {2\pi }}}\int \psi ^{*}(x)\left(-i{\frac {d}{dx}}\right)\int {\overline {\psi }}(k)\,e^{ikx}dk\,dx={\frac {1}{\sqrt {2\pi }}}\int \psi ^{*}(x)\int {\overline {\psi }}(k)\,k\,e^{ikx}dk\,dx.}$

Next we replace ${\displaystyle \psi ^{*}(x)}$ by ${\displaystyle (2\pi )^{-1/2}\int {\overline {\psi }}\,^{*}(k')\,e^{-ik'x}dk'}$  and shuffle the integrals with the mathematical nonchalance that is common in physics:

${\displaystyle \langle k\rangle =\int \!\int {\overline {\psi }}\,^{*}(k')\,k\,{\overline {\psi }}(k)\left[{\frac {1}{2\pi }}\int e^{i(k-k')x}dx\right]dk\,dk'.}$

The expression in square brackets is a representation of Dirac's delta distribution ${\displaystyle \delta (k-k'),}$ the defining characteristic of which is  ${\displaystyle \int _{-\infty }^{+\infty }f(x)\,\delta (x)\,dx=f(0)}$  for any continuous function ${\displaystyle f(x).}$ (In case you didn't notice, this proves what was to be proved.)
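The equality just proved can also be checked numerically. A sketch using an assumed Gaussian test packet with carrier wavenumber ${\displaystyle k_{0}}$ (for which ${\displaystyle \langle k\rangle =k_{0}}$), evaluating ${\displaystyle \int \psi ^{*}(-i\,d/dx)\psi \,dx}$ on a grid:

```python
import numpy as np

k0, sigma = 3.0, 1.0   # carrier wavenumber and width (assumed test values)
x = np.linspace(-30.0, 30.0, 12001)
dx = x[1] - x[0]
# normalized Gaussian packet times the phase factor exp(i*k0*x)
psi = (np.pi * sigma**2)**-0.25 * np.exp(-x**2 / (2 * sigma**2)) * np.exp(1j * k0 * x)

dpsi = np.gradient(psi, dx)                              # d psi / dx
k_mean = np.real(np.sum(np.conj(psi) * (-1j) * dpsi) * dx)
print(k_mean)  # close to k0 = 3.0
```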

## Heisenberg

In 1927, one year after Schrödinger published his equation, Werner Heisenberg proved the so-called "uncertainty" relation

${\displaystyle \Delta x\,\Delta p\geq \hbar /2.}$

Heisenberg spoke of Unschärfe, the literal translation of which is "fuzziness" rather than "uncertainty". Since the relation ${\displaystyle \Delta x\,\Delta k\geq 1/2}$ is a consequence of the fact that ${\displaystyle \psi (x)}$ and ${\displaystyle {\overline {\psi }}(k)}$ are related to each other via a Fourier transformation, we leave the proof to the mathematicians. The fuzziness relation for position and momentum follows via ${\displaystyle p=\hbar k}$. It says that the fuzziness of a position (as measured by ${\displaystyle \Delta x}$ ) and the fuzziness of the corresponding momentum (as measured by ${\displaystyle \Delta p=\hbar \Delta k}$ ) must be such that their product equals at least ${\displaystyle \hbar /2.}$
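For a Gaussian wave packet the product ${\displaystyle \Delta x\,\Delta k}$ attains the minimum value 1/2. A numerical sketch; the packet width is an arbitrary test value, and the Fourier transform is evaluated by direct quadrature in the convention used above.

```python
import numpy as np

sigma = 1.3   # packet width (arbitrary test value)
x = np.linspace(-40.0, 40.0, 4001)
dx = x[1] - x[0]
psi = (np.pi * sigma**2)**-0.25 * np.exp(-x**2 / (2 * sigma**2))

# Fourier transform by direct quadrature: (2*pi)^(-1/2) * int psi e^{-ikx} dx
k = np.linspace(-8.0, 8.0, 2001)
dk = k[1] - k[0]
psi_bar = np.array([np.sum(psi * np.exp(-1j * kk * x)) * dx
                    for kk in k]) / np.sqrt(2 * np.pi)

def spread(grid, density, step):
    """Standard deviation of a (possibly unnormalized) density on a grid."""
    density = density / (np.sum(density) * step)
    mean = np.sum(grid * density) * step
    return np.sqrt(np.sum((grid - mean)**2 * density) * step)

product = spread(x, np.abs(psi)**2, dx) * spread(k, np.abs(psi_bar)**2, dk)
print(product)  # close to 0.5
```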

# The Feynman route to Schrödinger

The probabilities of the possible outcomes of measurements performed at a time ${\displaystyle t_{2}}$ are determined by the Schrödinger wave function ${\displaystyle \psi (\mathbf {r} ,t_{2})}$. The wave function ${\displaystyle \psi (\mathbf {r} ,t_{2})}$ is determined via the Schrödinger equation by ${\displaystyle \psi (\mathbf {r} ,t_{1}).}$ What determines ${\displaystyle \psi (\mathbf {r} ,t_{1})}$ ? Why, the outcome of a measurement performed at ${\displaystyle t_{1}}$ — what else? Actual measurement outcomes determine the probabilities of possible measurement outcomes.

## Two rules

In this chapter we develop the quantum-mechanical probability algorithm from two fundamental rules. To begin with, two definitions:

• Alternatives are possible sequences of measurement outcomes.
• With each alternative is associated a complex number called its amplitude.

Suppose that you want to calculate the probability of a possible outcome of a measurement given the actual outcome of an earlier measurement. Here is what you have to do:

• Choose any sequence of measurements that may be made in the meantime.
• Assign an amplitude to each alternative.
• Apply either of the following rules:

Rule A: If the intermediate measurements are made (or if it is possible to infer from other measurements what their outcomes would have been if they had been made), first square the absolute values of the amplitudes of the alternatives and then add the results.
Rule B: If the intermediate measurements are not made (and if it is not possible to infer from other measurements what their outcomes would have been), first add the amplitudes of the alternatives and then square the absolute value of the result.
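The difference between the two rules can be made concrete with just two alternatives; the amplitudes below are arbitrary illustrative numbers, not values from the text.

```python
a_1 = 0.6 + 0.3j   # amplitude of the first alternative (illustrative)
a_2 = 0.1 - 0.7j   # amplitude of the second alternative (illustrative)

p_rule_a = abs(a_1)**2 + abs(a_2)**2   # intermediate outcomes ascertainable
p_rule_b = abs(a_1 + a_2)**2           # intermediate outcomes not ascertainable
print(p_rule_a)             # about 0.95
print(p_rule_b)             # about 0.65
print(p_rule_b - p_rule_a)  # interference term 2*Re(a_1 * conj(a_2)), about -0.3
```

The two rules agree only when the interference term vanishes.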

In subsequent sections we will explore the consequences of these rules for a variety of setups, and we will think about their origin — their raison d'être. Here we shall use Rule B to determine the interpretation of ${\displaystyle {\overline {\psi }}(k)}$ given Born's probabilistic interpretation of ${\displaystyle \psi (x)}$.

In the so-called "continuum normalization", the unphysical limit of a particle with a sharp momentum ${\displaystyle \hbar k'}$ is associated with the wave function

${\displaystyle \psi _{k'}(x,t)={\frac {1}{\sqrt {2\pi }}}\int \delta (k-k')\,e^{i[kx-\omega (k)t]}dk={\frac {1}{\sqrt {2\pi }}}\,e^{i[k'x-\omega (k')t]}.}$

Hence we may write ${\displaystyle \psi (x,t)=\int {\overline {\psi }}(k)\,\psi _{k}(x,t)\,dk.}$

${\displaystyle {\overline {\psi }}(k)}$ is the amplitude for the outcome ${\displaystyle \hbar k}$ of an infinitely precise momentum measurement. ${\displaystyle \psi _{k}(x,t)}$ is the amplitude for the outcome ${\displaystyle x}$ of an infinitely precise position measurement performed (at time t) subsequent to an infinitely precise momentum measurement with outcome ${\displaystyle \hbar k.}$ And ${\displaystyle \psi (x,t)}$ is the amplitude for obtaining ${\displaystyle x}$ by an infinitely precise position measurement performed at time ${\displaystyle t.}$

The preceding equation therefore tells us that the amplitude for finding ${\displaystyle x}$ at ${\displaystyle t}$ is the product of

1. the amplitude for the outcome ${\displaystyle \hbar k}$ and
2. the amplitude for the outcome ${\displaystyle x}$ (at time ${\displaystyle t}$) subsequent to a momentum measurement with outcome ${\displaystyle \hbar k,}$

summed over all values of ${\displaystyle k.}$

Under the conditions stipulated by Rule A, we would have instead that the probability for finding ${\displaystyle x}$ at ${\displaystyle t}$ is the product of

1. the probability for the outcome ${\displaystyle \hbar k}$ and
2. the probability for the outcome ${\displaystyle x}$ (at time ${\displaystyle t}$) subsequent to a momentum measurement with outcome ${\displaystyle \hbar k,}$

summed over all values of ${\displaystyle k.}$

The latter is what we expect on the basis of standard probability theory. But if this holds under the conditions stipulated by Rule A, then the same holds with "amplitude" substituted for "probability" under the conditions stipulated by Rule B. Hence, given that ${\displaystyle \psi _{k}(x,t)}$ and ${\displaystyle \psi (x,t)}$ are amplitudes for obtaining the outcome ${\displaystyle x}$ in an infinitely precise position measurement, ${\displaystyle {\overline {\psi }}(k)}$ is the amplitude for obtaining the outcome ${\displaystyle \hbar k}$ in an infinitely precise momentum measurement.

Notes:

1. Since Rule B stipulates that the momentum measurement is not actually made, we need not worry about the impossibility of making an infinitely precise momentum measurement.
2. If we refer to ${\displaystyle |\psi (x)|^{2}}$ as "the probability of obtaining the outcome ${\displaystyle x,}$" what we mean is that ${\displaystyle |\psi (x)|^{2}}$ integrated over any interval or subset of the real line is the probability of finding our particle in this interval or subset.

## An experiment with two slits

The setup

In this experiment, the final measurement (to the possible outcomes of which probabilities are assigned) is the detection of an electron at the backdrop, by a detector situated at D (D being a particular value of x). The initial measurement outcome, on the basis of which probabilities are assigned, is the launch of an electron by an electron gun G. (Since we assume that G is the only source of free electrons, the detection of an electron behind the slit plate also indicates the launch of an electron in front of the slit plate.) The alternatives or possible intermediate outcomes are

• the electron went through the left slit (L),
• the electron went through the right slit (R).

The corresponding amplitudes are ${\displaystyle A_{L}}$ and ${\displaystyle A_{R}.}$

Here is what we need to know in order to calculate them:

• ${\displaystyle A_{L}}$ is the product of two complex numbers, for which we shall use the symbols ${\displaystyle \langle D|L\rangle }$ and ${\displaystyle \langle L|G\rangle .}$
• By the same token, ${\displaystyle A_{R}=\langle D|R\rangle \,\langle R|G\rangle .}$
• The absolute value of ${\displaystyle \langle B|A\rangle }$ is inversely proportional to the distance ${\displaystyle d(BA)}$ between A and B.
• The phase of ${\displaystyle \langle B|A\rangle }$ is proportional to ${\displaystyle d(BA).}$

For obvious reasons ${\displaystyle \langle B|A\rangle }$ is known as a propagator.

### Why product?

Recall the fuzziness ("uncertainty") relation, which implies that ${\displaystyle \Delta p\rightarrow \infty }$ as ${\displaystyle \Delta x\rightarrow 0.}$ In this limit the particle's momentum is completely indefinite or, what comes to the same thing, has no value at all. As a consequence, the probability of finding a particle at B, given that it was last "seen" at A, depends on the initial position A but not on any initial momentum, inasmuch as there is none. Hence whatever the particle does after its detection at A is independent of what it did before then. In probability-theoretic terms this means that the particle's propagation from G to L and its propagation from L to D are independent events. So the probability of propagation from G to D via L is the product of the corresponding probabilities, and so the amplitude of propagation from G to D via L is the product ${\displaystyle \langle D|L\,\rangle \langle L|G\rangle }$ of the corresponding amplitudes.

### Why is the absolute value inversely proportional to the distance?

Imagine (i) a sphere of radius ${\displaystyle r}$ whose center is A and (ii) a detector monitoring a unit area of the surface of this sphere. Since the total surface area is proportional to ${\displaystyle r^{2},}$ and since for a free particle the probability of detection per unit area is constant over the entire surface (explain why!), the probability of detection per unit area is inversely proportional to ${\displaystyle r^{2}.}$ The absolute value of the amplitude of detection per unit area, being the square root of the probability, is therefore inversely proportional to ${\displaystyle r.}$
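The scaling can be restated in a few lines of arithmetic (units arbitrary; this sketch just encodes the argument above): total probability 1 spread uniformly over a sphere gives a probability per unit area falling off as ${\displaystyle 1/r^{2},}$ and hence an amplitude magnitude falling off as ${\displaystyle 1/r.}$

```python
import math

# Total detection probability 1, spread uniformly over a sphere of radius r:
def prob_per_unit_area(r):
    return 1.0 / (4 * math.pi * r**2)

# The amplitude magnitude per unit area is the square root of that:
def amp_magnitude(r):
    return math.sqrt(prob_per_unit_area(r))

# Doubling r quarters the probability density and halves the amplitude:
print(prob_per_unit_area(2.0) / prob_per_unit_area(1.0))  # 0.25
print(amp_magnitude(2.0) / amp_magnitude(1.0))            # 0.5
```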

### Why is the phase proportional to the distance?

The multiplicativity of successive propagators implies the additivity of their phases. Together with the fact that, in the case of a free particle, the propagator ${\displaystyle \langle B|A\rangle }$ (and hence its phase) can only depend on the distance between A and B, it implies the proportionality of the phase of ${\displaystyle \langle B|A\rangle }$ to ${\displaystyle d(BA).}$

### Calculating the interference pattern

According to Rule A, the probability of detecting at D an electron launched at G is

${\displaystyle p_{A}(D)=|\langle D|L\rangle \,\langle L|G\rangle |^{2}+|\langle D|R\rangle \,\langle R|G\rangle |^{2}.}$

If the slits are equidistant from G, then ${\displaystyle \langle L|G\rangle }$ and ${\displaystyle \langle R|G\rangle }$ are equal and ${\displaystyle p_{A}(D)}$ is proportional to

${\displaystyle |\langle D|L\rangle |^{2}+|\langle D|R\rangle |^{2}=1/d^{2}(DL)+1/d^{2}(DR).}$

Here is the resulting plot of ${\displaystyle p_{A}}$ against the position ${\displaystyle x}$ of the detector:

Predicted relative frequency of detection according to Rule A

${\displaystyle p_{A}(x)}$ (solid line) is the sum of two distributions (dotted lines), one for the electrons that went through L and one for the electrons that went through R.

According to Rule B, the probability ${\displaystyle p_{B}(D)}$ of detecting at D an electron launched at G is proportional to

${\displaystyle |\langle D|L\rangle +\langle D|R\rangle |^{2}=1/d^{2}(DL)+1/d^{2}(DR)+2\cos(k\Delta )/[d(DL)\,d(DR)],}$

where ${\displaystyle \Delta }$ is the difference ${\displaystyle d(DR)-d(DL)}$ and ${\displaystyle k=p/\hbar }$ is the wavenumber, which is sufficiently sharp to be approximated by a number. (And it goes without saying that you should check this result.)
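The text invites you to check this result; here is one way to do so numerically, sketched with illustrative values for the slit separation, the plate-backdrop distance, and the wavenumber (none of these numbers come from the text). Each propagator has magnitude ${\displaystyle 1/d}$ and phase ${\displaystyle kd,}$ as stipulated above.

```python
import numpy as np

# Illustrative geometry (assumed, not from the text): slits at x = -a and
# x = +a on the slit plate, backdrop at distance R, detector at position x.
a, R, k = 0.5, 10.0, 50.0

def d(x, slit_x):
    """Distance from a slit at slit_x to the detector at backdrop position x."""
    return np.hypot(R, x - slit_x)

x = np.linspace(-4.0, 4.0, 9)
dL, dR = d(x, -a), d(x, +a)

# Propagator <D|slit>: magnitude 1/d, phase k*d.
amp_L = np.exp(1j * k * dL) / dL
amp_R = np.exp(1j * k * dR) / dR

# Rule B: add the amplitudes, then square.
p_B = np.abs(amp_L + amp_R) ** 2

# The closed form quoted in the text:
Delta = dR - dL
p_B_closed = 1 / dL**2 + 1 / dR**2 + 2 * np.cos(k * Delta) / (dL * dR)

print(np.allclose(p_B, p_B_closed))   # True
```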

Here is the plot of ${\displaystyle p_{B}}$ against ${\displaystyle x}$ for a particular set of values for the wavenumber, the distance between the slits, and the distance between the slit plate and the backdrop:

Predicted relative frequency of detection according to Rule B

Observe that near the minima the probability of detection is less if both slits are open than it is if one slit is shut. It is customary to say that destructive interference occurs at the minima and that constructive interference occurs at the maxima, but do not think of this as the description of a physical process. All we mean by "constructive interference" is that a probability calculated according to Rule B is greater than the same probability calculated according to Rule A, and all we mean by "destructive interference" is that a probability calculated according to Rule B is less than the same probability calculated according to Rule A.

Here is how an interference pattern builds up over time[1]:

1. A. Tonomura, J. Endo, T. Matsuda, T. Kawasaki, & H. Ezawa, "Demonstration of single-electron buildup of an interference pattern", American Journal of Physics 57, 117-120, 1989.

## Bohm's story

### Hidden Variables

Suppose that the conditions stipulated by Rule B are met: there is nothing — no event, no state of affairs, anywhere, anytime — from which the slit taken by an electron can be inferred. Can it be true, in this case,

• that each electron goes through a single slit — either L or R — and
• that the behavior of an electron that goes through one slit does not depend on whether the other slit is open or shut?

To keep the language simple, we will say that an electron leaves a mark where it is detected at the backdrop. If each electron goes through a single slit, then the observed distribution of marks when both slits are open is the sum of two distributions, one from electrons that went through L and one from electrons that went through R:

${\displaystyle p_{B}(x)=p_{L}(x)+p_{R}(x)\,\!}$

If in addition the behavior of an electron that goes through one slit does not depend on whether the other slit is open or shut, then we can observe ${\displaystyle p_{L}(x)}$ by keeping R shut, and we can observe ${\displaystyle p_{R}(x)}$ by keeping L shut. What we observe if R is shut is the left dashed hump, and what we observe if L is shut is the right dashed hump:

Hence if the above two conditions (as well as those stipulated by Rule B) are satisfied, we will see the sum of these two humps. In reality what we see is this:

Thus all of those conditions cannot be simultaneously satisfied. If Rule B applies, then either it is false that each electron goes through a single slit or the behavior of an electron that goes through one slit does depend on whether the other slit is open or shut.

Which is it?

According to one attempt to make physical sense of the mathematical formalism of quantum mechanics, due to Louis de Broglie and David Bohm, each electron goes through a single slit, and the behavior of an electron that goes through one slit depends on whether the other slit is open or shut.

So how does the state of, say, the right slit (open or shut) affect the behavior of an electron that goes through the left slit? In both de Broglie's pilot wave theory and Bohmian mechanics, the electron is assumed to be a well-behaved particle in the sense that it follows a precise path — its position at any moment is given by three coordinates — and in addition there exists a wave that guides the electron by exerting a force on it. If only one slit is open, this wave passes through that slit. If both slits are open, the wave passes through both slits and interferes with itself (in the "classical" sense of interference). As a result, it guides the electrons along wiggly paths that cluster at the backdrop so as to produce the observed interference pattern:

According to this story, the reason why electrons coming from the same source or slit arrive in different places is that they start out in slightly different directions and/or with slightly different speeds. If we had exact knowledge of their initial positions and momenta, we could make an exact prediction of each electron's subsequent motion. This, however, is impossible. The uncertainty principle prevents us from making exact predictions of a particle's motion. Hence even though according to Bohm the initial positions and momenta have precise values, we can never know them.

If positions and momenta have precise values, then why can we not measure them? It used to be said that this is because a measurement exerts an uncontrollable influence on the value of the observable being measured. Yet this merely raises another question: why do measurements exert uncontrollable influences? This may be true for all practical purposes, but the uncertainty principle does not say that ${\displaystyle \Delta x\,\Delta p\geq \hbar /2}$ merely holds for all practical purposes. Moreover, it isn't the case that measurements necessarily "disturb" the systems on which they are performed.

The statistical element of quantum mechanics is an essential feature of the theory. The postulate of an underlying determinism, which in order to be consistent with the theory has to be a crypto-determinism, not only adds nothing to our understanding of the theory but also precludes any proper understanding of this essential feature of the theory. There is, in fact, a simple and obvious reason why hidden variables are hidden: the reason why they are strictly (rather than merely for all practical purposes) unobservable is that they do not exist.

At one time Einstein insisted that theories ought to be formulated without reference to unobservable quantities. When Heisenberg later mentioned to Einstein that this maxim had guided him in his discovery of the uncertainty principle, Einstein replied something to this effect: "Even if I once said so, it is nonsense." His point was that before one has a theory, one cannot know what is observable and what is not. Our situation here is different. We have a theory, and this tells in no uncertain terms what is observable and what is not.

## Propagator for a free and stable particle

### The propagator as a path integral

Suppose that we make m intermediate position measurements at fixed intervals of duration ${\displaystyle \Delta t.}$ Each of these measurements is made with the help of an array of detectors monitoring n mutually disjoint regions ${\displaystyle R_{k},}$ ${\displaystyle k=1,\dots ,n.}$ Under the conditions stipulated by Rule B, the propagator ${\displaystyle \langle B|A\rangle }$ now equals the sum of amplitudes

${\displaystyle \sum _{k_{1}=1}^{n}\cdots \sum _{k_{m}=1}^{n}\langle B|R_{k_{m}}\rangle \cdots \langle R_{k_{2}}|R_{k_{1}}\rangle \,\langle R_{k_{1}}|A\rangle .}$
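The structure of this multiple sum may become more transparent in code: summing over all sequences of intermediate regions is the same as chaining matrix-vector products. The sketch below uses random complex numbers as toy one-step amplitudes (they are not a physical propagator; the point is only the summation structure).

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
n, m = 5, 4   # n regions per time slice, m intermediate measurements

# Toy amplitudes (illustrative random complex numbers, not physics):
# U[j, i] plays the role of <R_j | R_i> for one time step.
U = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
a = rng.normal(size=n) + 1j * rng.normal(size=n)   # <R_{k1} | A>
b = rng.normal(size=n) + 1j * rng.normal(size=n)   # <B | R_{km}>

# Brute-force nested sum over k1 ... km (n**m terms):
brute = 0
for ks in itertools.product(range(n), repeat=m):
    amp = a[ks[0]]
    for prev, nxt in zip(ks, ks[1:]):
        amp *= U[nxt, prev]
    amp *= b[ks[-1]]
    brute += amp

# The same quantity as a chain of m-1 matrix-vector products:
vec = a
for _ in range(m - 1):
    vec = U @ vec
fast = b @ vec

print(np.isclose(brute, fast))   # True
```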

It is not hard to see what happens in the double limit ${\displaystyle \Delta t\rightarrow 0}$ (which implies that ${\displaystyle m\rightarrow \infty }$) and ${\displaystyle n\rightarrow \infty .}$ The multiple sum ${\displaystyle \sum _{k_{1}=1}^{n}\cdots \sum _{k_{m}=1}^{n}}$ becomes an integral ${\displaystyle \int \!{\mathcal {DC}}}$ over continuous spacetime paths from A to B, and the amplitude ${\displaystyle \langle B|R_{k_{m}}\rangle \cdots \langle R_{k_{1}}|A\rangle }$ becomes a complex-valued functional ${\displaystyle Z[{\mathcal {C}}:A\rightarrow B]}$ — a complex function of continuous functions representing continuous spacetime paths from A to B:

${\displaystyle \langle B|A\rangle =\int \!{\mathcal {DC}}\,Z[{\mathcal {C}}:A\rightarrow B]}$

The integral ${\displaystyle \int \!{\mathcal {DC}}}$ is not your standard Riemann integral ${\displaystyle \int _{a}^{b}dx\,f(x),}$ to which each infinitesimal interval ${\displaystyle dx}$ makes a contribution proportional to the value that ${\displaystyle f(x)}$ takes inside the interval, but a functional or path integral, to which each "bundle" of paths of infinitesimal width ${\displaystyle {\mathcal {DC}}}$ makes a contribution proportional to the value that ${\displaystyle Z[{\mathcal {C}}]}$ takes inside the bundle.

As it stands, the path integral ${\displaystyle \int \!{\mathcal {DC}}}$ is just the idea of an idea. Appropriate evaluation methods have to be devised on a more or less case-by-case basis.

### A free particle

Now pick any path ${\displaystyle {\mathcal {C}}}$ from A to B, and then pick any infinitesimal segment ${\displaystyle d{\mathcal {C}}}$ of ${\displaystyle {\mathcal {C}}}$. Label the start and end points of ${\displaystyle d{\mathcal {C}}}$ by inertial coordinates ${\displaystyle t,x,y,z}$ and ${\displaystyle t+dt,x+dx,y+dy,z+dz,}$ respectively. In the general case, the amplitude ${\displaystyle Z(d{\mathcal {C}})}$ will be a function of ${\displaystyle t,x,y,z}$ and ${\displaystyle dt,dx,dy,dz.}$ In the case of a free particle, ${\displaystyle Z(d{\mathcal {C}})}$ depends neither on the position of ${\displaystyle d{\mathcal {C}}}$ in spacetime (given by ${\displaystyle t,x,y,z}$) nor on the spacetime orientation of ${\displaystyle d{\mathcal {C}}}$ (given by the four-velocity ${\displaystyle (c\,dt/ds,dx/ds,dy/ds,dz/ds)}$) but only on the proper time interval ${\displaystyle ds={\sqrt {dt^{2}-(dx^{2}+dy^{2}+dz^{2})/c^{2}}}.}$

(Because its norm equals the speed of light, the four-velocity depends on three rather than four independent parameters. Together with ${\displaystyle ds,}$ they contain the same information as the four independent numbers ${\displaystyle dt,dx,dy,dz.}$)

Thus for a free particle ${\displaystyle Z(d{\mathcal {C}})=Z(ds).}$ With this, the multiplicativity of successive propagators tells us that

${\displaystyle \prod _{j}Z(ds_{j})=Z{\Bigl (}\sum _{j}ds_{j}{\Bigr )}\longrightarrow Z{\Bigl (}\int _{\mathcal {C}}ds{\Bigr )}}$

It follows that there is a complex number ${\displaystyle z}$ such that ${\displaystyle Z[{\mathcal {C}}]=e^{z\,s[{\mathcal {C}}:A\rightarrow B]},}$ where the line integral ${\displaystyle s[{\mathcal {C}}:A\rightarrow B]=\int _{\mathcal {C}}ds}$ gives the time that passes on a clock as it travels from A to B via ${\displaystyle {\mathcal {C}}.}$

### A free and stable particle

By integrating ${\displaystyle {\bigl |}\langle B|A\rangle {\bigr |}^{2}}$ (as a function of ${\displaystyle \mathbf {r} _{B}}$) over the whole of space, we obtain the probability of finding that a particle launched at the spacetime point ${\displaystyle t_{A},\mathbf {r} _{A}}$ still exists at the time ${\displaystyle t_{B}.}$ For a stable particle this probability equals 1:

${\displaystyle \int \!d^{3}r_{B}\left|\langle t_{B},\mathbf {r} _{B}|t_{A},\mathbf {r} _{A}\rangle \right|^{2}=\int \!d^{3}r_{B}\left|\int \!{\mathcal {DC}}\,e^{z\,s[{\mathcal {C}}:A\rightarrow B]}\right|^{2}=1}$

If you contemplate this equation with a calm heart and an open mind, you will notice that if the complex number ${\displaystyle z=a+ib}$ had a real part ${\displaystyle a\neq 0,}$ then the integral between the two equal signs would either blow up ${\displaystyle (a>0)}$ or drop off ${\displaystyle (a<0)}$ exponentially as a function of ${\displaystyle t_{B}}$, due to the exponential factor ${\displaystyle e^{a\,s[{\mathcal {C}}]}}$.
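A small numerical illustration of this argument, drastically simplified to a single toy path so that ${\displaystyle \langle B|A\rangle =e^{z\,s}}$ (an assumption made purely for illustration): the squared magnitude ${\displaystyle |e^{(a+ib)s}|^{2}=e^{2as}}$ stays equal to 1 only when ${\displaystyle a=0.}$

```python
import numpy as np

# Survival probability of a toy single-path "clock amplitude" e^{z s}:
# |e^{(a+ib)s}|^2 = e^{2as}, so only a = 0 keeps the probability at 1.
s = np.linspace(0.0, 10.0, 5)

def survival(a, b=2.0):
    return np.abs(np.exp((a + 1j * b) * s)) ** 2

print(survival(0.0))    # identically 1: a stable particle
print(survival(0.3))    # blows up exponentially
print(survival(-0.3))   # decays exponentially
```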

### Meaning of mass

The propagator for a free and stable particle thus has a single "degree of freedom": it depends solely on the value of ${\displaystyle b.}$ If proper time is measured in seconds, then ${\displaystyle b}$ is measured in radians per second. We may think of ${\displaystyle e^{ib\,s},}$ with ${\displaystyle s}$ a proper-time parametrization of ${\displaystyle {\mathcal {C}},}$ as a clock carried by a particle that travels from A to B via ${\displaystyle {\mathcal {C}},}$ provided we keep in mind that we are thinking of an aspect of the mathematical formalism of quantum mechanics rather than an aspect of the real world.

It is customary

• to insert a minus (so the clock actually turns clockwise!): ${\displaystyle Z=e^{-ib\,s[{\mathcal {C}}]},}$
• to multiply by ${\displaystyle 2\pi }$ (so that we may think of ${\displaystyle b}$ as the rate at which the clock "ticks" — the number of cycles it completes each second): ${\displaystyle Z=e^{-i\,2\pi \,b\,s[{\mathcal {C}}]},}$
• to divide by Planck's constant ${\displaystyle h}$ (so that ${\displaystyle b}$ is measured in energy units and called the rest energy of the particle): ${\displaystyle Z=e^{-i(2\pi /h)\,b\,s[{\mathcal {C}}]}=e^{-(i/\hbar )\,b\,s[{\mathcal {C}}]},}$
• and to multiply by ${\displaystyle c^{2}}$ (so that ${\displaystyle b}$ is measured in mass units and called the particle's rest mass): ${\displaystyle Z=e^{-(i/\hbar )\,b\,c^{2}\,s[{\mathcal {C}}]}.}$

The purpose of using the same letter ${\displaystyle b}$ everywhere is to emphasize that it denotes the same physical quantity, merely measured in different units. If we use natural units in which ${\displaystyle \hbar =c=1,}$ rather than conventional ones, the identity of the various ${\displaystyle b}$'s is immediately obvious.
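As a worked example of these conversions, take the electron (constants hardcoded at approximate CODATA values; this is a sketch, not a unit library): starting from the rest mass, multiplying by ${\displaystyle c^{2}}$ gives the rest energy, dividing by ${\displaystyle h}$ gives the clock's tick rate in cycles per second, and multiplying by ${\displaystyle 2\pi }$ gives radians per second.

```python
import math

# Approximate physical constants (SI units), hardcoded for illustration:
h = 6.62607015e-34       # Planck's constant, J s
c = 2.99792458e8         # speed of light, m/s
m_e = 9.1093837015e-31   # electron rest mass, kg

rest_energy = m_e * c**2                          # b in energy units (J)
ticks_per_second = rest_energy / h                # b as a frequency, cycles/s
rad_per_second = 2 * math.pi * ticks_per_second   # b in radians/s

print(rest_energy)       # ≈ 8.19e-14 J (≈ 0.511 MeV)
print(ticks_per_second)  # ≈ 1.24e20 Hz
```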

## From quantum to classical

### Action

Let's go back to the propagator

${\displaystyle \langle B|A\rangle =\int \!{\mathcal {DC}}\,Z[{\mathcal {C}}:A\rightarrow B].}$

For a free and stable particle we found that

${\displaystyle Z[{\mathcal {C}}]=e^{-(i/\hbar )\,m\,c^{2}\,s[{\mathcal {C}}]},\qquad s[{\mathcal {C}}]=\int _{\mathcal {C}}ds,}$

where ${\displaystyle ds={\sqrt {dt^{2}-(dx^{2}+dy^{2}+dz^{2})/c^{2}}}}$ is the proper-time interval associated with the path element ${\displaystyle d{\mathcal {C}}}$. For the general case we found that the amplitude ${\displaystyle Z(d{\mathcal {C}})}$ is a function of ${\displaystyle t,x,y,z}$ and ${\displaystyle dt,dx,dy,dz}$ or, equivalently, of the coordinates ${\displaystyle t,x,y,z}$, the components ${\displaystyle c\,dt/ds,dx/ds,dy/ds,dz/ds}$ of the 4-velocity, as well as ${\displaystyle ds}$. For a particle that is stable but not free, we obtain, by the same argument that led to the above amplitude,

${\displaystyle Z[{\mathcal {C}}]=e^{(i/\hbar )\,S[{\mathcal {C}}]},}$

where we have introduced the functional ${\displaystyle S[{\mathcal {C}}]=\int _{\mathcal {C}}dS}$, which goes by the name action.

For a free and stable particle, ${\displaystyle S[{\mathcal {C}}]}$ is the proper time (or proper duration) ${\displaystyle s[{\mathcal {C}}]=\int _{\mathcal {C}}ds}$ multiplied by ${\displaystyle -mc^{2}}$, and the infinitesimal action ${\displaystyle dS[d{\mathcal {C}}]}$ is proportional to ${\displaystyle ds}$:

${\displaystyle S[{\mathcal {C}}]=-m\,c^{2}\,s[{\mathcal {C}}],\qquad dS[d{\mathcal {C}}]=-m\,c^{2}\,ds.}$

Let's recap. We know all about the motion of a stable particle if we know how to calculate the probability ${\displaystyle p(B|A)}$ (in all circumstances). We know this if we know the amplitude ${\displaystyle \langle B|A\rangle }$. We know the latter if we know the functional ${\displaystyle Z[{\mathcal {C}}]}$. And we know this functional if we know the infinitesimal action ${\displaystyle dS(t,x,y,z,dt,dx,dy,dz)}$ or ${\displaystyle dS(t,\mathbf {r} ,dt,d\mathbf {r} )}$ (in all circumstances).

What do we know about ${\displaystyle dS}$?

The multiplicativity of successive propagators implies the additivity of actions associated with neighboring infinitesimal path segments ${\displaystyle d{\mathcal {C}}_{1}}$ and ${\displaystyle d{\mathcal {C}}_{2}}$. In other words,

${\displaystyle e^{(i/\hbar )\,dS(d{\mathcal {C}}_{1}+d{\mathcal {C}}_{2})}=e^{(i/\hbar )\,dS(d{\mathcal {C}}_{2})}\,e^{(i/\hbar )\,dS(d{\mathcal {C}}_{1})}}$

implies

${\displaystyle dS(d{\mathcal {C}}_{1}+d{\mathcal {C}}_{2})=dS(d{\mathcal {C}}_{1})+dS(d{\mathcal {C}}_{2}).}$

It follows that the differential ${\displaystyle dS}$ is homogeneous (of degree 1) in the differentials ${\displaystyle dt,d\mathbf {r} }$:

${\displaystyle dS(t,\mathbf {r} ,\lambda \,dt,\lambda \,d\mathbf {r} )=\lambda \,dS(t,\mathbf {r} ,dt,d\mathbf {r} ).}$

This property of ${\displaystyle dS}$ makes it possible to think of the action ${\displaystyle S[{\mathcal {C}}]}$ as a (particle-specific) length associated with ${\displaystyle {\mathcal {C}}}$, and of ${\displaystyle dS}$ as defining a (particle-specific) spacetime geometry. By substituting ${\displaystyle 1/dt}$ for ${\displaystyle \lambda }$ we get:

${\displaystyle dS(t,\mathbf {r} ,\mathbf {v} )={\frac {dS}{dt}}.}$

Something is wrong, isn't it? Since the right-hand side is now a finite quantity, we shouldn't use the symbol ${\displaystyle dS}$ for the left-hand side. What we have actually found is that there is a function ${\displaystyle L(t,\mathbf {r} ,\mathbf {v} )}$, which goes by the name Lagrange function, such that ${\displaystyle dS=L\,dt}$.
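For the free particle this works out explicitly: with ${\displaystyle dS=-mc^{2}\,ds}$ and ${\displaystyle ds=dt{\sqrt {1-v^{2}/c^{2}}},}$ the Lagrange function is ${\displaystyle L=-mc^{2}{\sqrt {1-v^{2}/c^{2}}}.}$ A quick numerical sketch (units with ${\displaystyle m=c=1}$) shows that for ${\displaystyle v\ll c}$ this reduces to the familiar kinetic energy ${\displaystyle {\tfrac {1}{2}}mv^{2}}$ minus the constant ${\displaystyle mc^{2}}$, which does not affect the equations of motion.

```python
import math

# Free-particle Lagrange function from dS = -m c^2 ds:
# ds = dt * sqrt(1 - v^2/c^2), hence L = dS/dt = -m c^2 sqrt(1 - v^2/c^2).
def L(v, m=1.0, c=1.0):
    return -m * c**2 * math.sqrt(1 - v**2 / c**2)

# For v << c, L is well approximated by -m c^2 + (1/2) m v^2:
v = 0.01
exact = L(v)
approx = -1.0 + 0.5 * v**2
print(exact, approx)   # nearly identical
```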

### Geodesic equations

Consider a spacetime path ${\displaystyle {\mathcal {C}}}$ from ${\displaystyle A}$ to ${\displaystyle B.}$ Let's change ("vary") it in such a way that every point ${\displaystyle (t,\mathbf {r} )}$ of ${\displaystyle {\mathcal {C}}}$ gets shifted by an infinitesimal amount to a corresponding point ${\displaystyle (t+\delta t,\mathbf {r} +\delta \mathbf {r} ),}$ except the end points, which are held fixed: ${\displaystyle \delta t=0}$ and ${\displaystyle \delta \mathbf {r} =0}$ at both ${\displaystyle A}$ and ${\displaystyle B.}$

If ${\displaystyle t\rightarrow t+\delta t,}$ then ${\displaystyle dt=t_{2}-t_{1}\longrightarrow t_{2}+\delta t_{2}-(t_{1}+\delta t_{1})=(t_{2}-t_{1})+(\delta t_{2}-\delta t_{1})=dt+d\delta t.}$

By the same token, ${\displaystyle d\mathbf {r} \rightarrow d\mathbf {r} +d\delta \mathbf {r} .}$

In general, the change ${\displaystyle {\mathcal {C}}\rightarrow {\mathcal {C}}'}$ will cause a corresponding change in the action: ${\displaystyle S[{\mathcal {C}}]\rightarrow S[{\mathcal {C}}']\neq S[{\mathcal {C}}].}$ If the action does not change (that is, if it is stationary at ${\displaystyle {\mathcal {C}}}$ ),

${\displaystyle \delta S=\int _{{\mathcal {C}}'}dS-\int _{\mathcal {C}}dS=0,}$

then ${\displaystyle {\mathcal {C}}}$ is a geodesic of the geometry defined by ${\displaystyle dS.}$ (A function ${\displaystyle f(x)}$ is stationary at those values of ${\displaystyle x}$ at which its value does not change if ${\displaystyle x}$ changes infinitesimally. By the same token we call a functional ${\displaystyle S[{\mathcal {C}}]}$ stationary if its value does not change if ${\displaystyle {\mathcal {C}}}$ changes infinitesimally.)

To obtain a handier way to characterize geodesics, we begin by expanding

${\displaystyle dS({\mathcal {C}}')=dS(t+\delta t,\mathbf {r} +\delta \mathbf {r} ,dt+d\delta t,d\mathbf {r} +d\delta \mathbf {r} )}$
${\displaystyle =dS(t,\mathbf {r} ,dt,d\mathbf {r} )+{\frac {\partial dS}{\partial t}}\,\delta t+{\frac {\partial dS}{\partial \mathbf {r} }}\cdot \delta \mathbf {r} +{\frac {\partial dS}{\partial dt}}\,d\delta t+{\frac {\partial dS}{\partial d\mathbf {r} }}\cdot d\delta \mathbf {r} .}$

This gives us

${\displaystyle (^{*})\quad \int _{{\mathcal {C}}'}dS-\int _{\mathcal {C}}dS=\int _{\mathcal {C}}\left[{\partial dS \over \partial t}\delta t+{\partial dS \over \partial \mathbf {r} }\cdot \delta \mathbf {r} +{\partial dS \over \partial dt}d\,\delta t+{\partial dS \over \partial d\mathbf {r} }\cdot d\,\delta \mathbf {r} \right].}$

Next we use the product rule for derivatives,

${\displaystyle d\left({\partial dS \over \partial dt}\delta t\right)=\left(d{\partial dS \over \partial dt}\right)\delta t+{\partial dS \over \partial dt}d\,\delta t,}$
${\displaystyle d\left({\partial dS \over \partial d\mathbf {r} }\cdot \delta \mathbf {r} \right)=\left(d{\partial dS \over \partial d\mathbf {r} }\right)\cdot \delta \mathbf {r} +{\partial dS \over \partial d\mathbf {r} }\cdot d\,\delta \mathbf {r} ,}$

to replace the last two terms of (*), which takes us to

${\displaystyle \delta S=\int \left[\left({\partial dS \over \partial t}-d{\partial dS \over \partial dt}\right)\delta t+\left({\partial dS \over \partial \mathbf {r} }-d{\partial dS \over \partial d\mathbf {r} }\right)\cdot \delta \mathbf {r} \right]+\int d\left({\partial dS \over \partial dt}\delta t+{\partial dS \over \partial d\mathbf {r} }\cdot \delta \mathbf {r} \right).}$

The second integral vanishes because it is equal to the difference between the values of the expression in brackets at the end points ${\displaystyle A}$ and ${\displaystyle B,}$ where ${\displaystyle \delta t=0}$ and ${\displaystyle \delta \mathbf {r} =0.}$ If ${\displaystyle {\mathcal {C}}}$ is a geodesic, then the first integral vanishes, too. In fact, in this case ${\displaystyle \delta S=0}$ must hold for all possible (infinitesimal) variations ${\displaystyle \delta t}$ and ${\displaystyle \delta \mathbf {r} ,}$ whence it follows that the integrand of the first integral vanishes. The bottom line is that the geodesics defined by ${\displaystyle dS}$ satisfy the geodesic equations

 ${\displaystyle {\partial dS \over \partial t}=d\,{\partial dS \over \partial dt},\qquad {\partial dS \over \partial \mathbf {r} }=d\,{\partial dS \over \partial d\mathbf {r} }.}$
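For the free particle, whose action is ${\displaystyle S[{\mathcal {C}}]=-mc^{2}\,s[{\mathcal {C}}],}$ the geodesics are straight worldlines. A numerical sketch (one space dimension, units with ${\displaystyle c=1}$): the straight worldline between two events has more proper time than any varied path with the same end points, so with the minus sign it has the least action.

```python
import numpy as np

# Proper time along a 1+1-dimensional worldline from (t=0, x=0) to
# (t=1, x=0), in units with c = 1: s[C] = sum of sqrt(dt^2 - dx^2).
def proper_time(x, t):
    dt, dx = np.diff(t), np.diff(x)
    return np.sum(np.sqrt(dt**2 - dx**2))

t = np.linspace(0.0, 1.0, 1001)
straight = np.zeros_like(t)             # the geodesic: x(t) = 0
wiggly = 0.05 * np.sin(4 * np.pi * t)   # a varied path, same end points

s_straight = proper_time(straight, t)
s_wiggly = proper_time(wiggly, t)

# The straight worldline maximizes proper time, so with S = -m c^2 s
# it minimizes the action:
print(s_straight > s_wiggly)   # True
```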

### Principle of least action

If an object travels from ${\displaystyle A}$ to ${\displaystyle B,}$ it travels along all paths from ${\displaystyle A}$ to ${\displaystyle B,}$ in the same sense in which an electron goes through both slits. Then how is it that a big thing (such as a planet, a tennis ball, or a mosquito) appears to move along a single well-defined path?

There are at least two reasons. One of them is that the bigger an object is, the harder it is to satisfy the conditions stipulated by Rule ${\displaystyle B.}$ Another reason is that even if these conditions are satisfied, the likelihood of finding an object of mass ${\displaystyle m}$ where according to the laws of classical physics it should not be, decreases as ${\displaystyle m}$ increases.

To see this, we need to take account of the fact that it is strictly impossible to check whether an object that has travelled from ${\displaystyle A}$ to ${\displaystyle B,}$ has done so along a mathematically precise path ${\displaystyle {\mathcal {C}}.}$ Let us make the half-realistic assumption that what we can check is whether an object has travelled from ${\displaystyle A}$ to ${\displaystyle B}$ within a narrow bundle of paths — the paths contained in a narrow tube ${\displaystyle {\mathcal {T}}.}$ The probability of finding that it has, is the absolute square of the path integral ${\displaystyle I({\mathcal {T}})=\int _{\mathcal {T}}{\mathcal {DC}}e^{(i/\hbar )S[{\mathcal {C}}]},}$ which sums over the paths contained in ${\displaystyle {\mathcal {T}}.}$

Let us assume that there is exactly one path from ${\displaystyle A}$ to ${\displaystyle B}$ for which ${\displaystyle S[{\mathcal {C}}]}$ is stationary: its length does not change if we vary the path ever so slightly, no matter how. In other words, we assume that there is exactly one geodesic. Let's call it ${\displaystyle {\mathcal {G}},}$ and let's assume it lies in ${\displaystyle {\mathcal {T}}.}$

No matter how rapidly the phase ${\displaystyle S[{\mathcal {C}}]/\hbar }$ changes under variation of a generic path ${\displaystyle {\mathcal {C}},}$ it will be stationary at ${\displaystyle {\mathcal {G}}.}$ This means, loosely speaking, that a large number of paths near ${\displaystyle {\mathcal {G}}}$ contribute to ${\displaystyle I({\mathcal {T}})}$ with almost equal phases. As a consequence, the magnitude of the sum of the corresponding phase factors ${\displaystyle e^{(i/\hbar )S[{\mathcal {C}}]}}$ is large.

If ${\displaystyle S[{\mathcal {C}}]/\hbar }$ is not stationary at ${\displaystyle {\mathcal {C}},}$ all depends on how rapidly it changes under variation of ${\displaystyle {\mathcal {C}}.}$ If it changes sufficiently rapidly, the phases associated with paths near ${\displaystyle {\mathcal {C}}}$ are more or less equally distributed over the interval ${\displaystyle [0,2\pi ],}$ so that the corresponding phase factors add up to a complex number of comparatively small magnitude. In the limit ${\displaystyle S[{\mathcal {C}}]/\hbar \rightarrow \infty ,}$ the only significant contributions to ${\displaystyle I({\mathcal {T}})}$ come from paths in the infinitesimal neighborhood of ${\displaystyle {\mathcal {G}}.}$

We have assumed that ${\displaystyle {\mathcal {G}}}$ lies in ${\displaystyle {\mathcal {T}}.}$ If it does not, and if ${\displaystyle S[{\mathcal {C}}]/\hbar }$ changes sufficiently rapidly, the phases associated with paths near any path in ${\displaystyle {\mathcal {T}}}$ are more or less equally distributed over the interval ${\displaystyle [0,2\pi ],}$ so that in the limit ${\displaystyle S[{\mathcal {C}}]/\hbar \rightarrow \infty }$ there are no significant contributions to ${\displaystyle I({\mathcal {T}}).}$
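Both cases admit a toy illustration of the stationary-phase argument. In this sketch the paths in a "tube" are caricatured by a single real parameter ${\displaystyle u,}$ with phase ${\displaystyle S(u)/\hbar =\alpha u^{2}}$ (an assumption made purely for illustration; the stationary point ${\displaystyle u=0}$ plays the role of the geodesic).

```python
import numpy as np

# Paths in a "tube" caricatured by one real parameter u; the phase
# alpha*(u + offset)^2 is stationary at u = -offset (the geodesic).
u = np.linspace(-0.5, 0.5, 20001)
du = u[1] - u[0]

def tube_amplitude(offset, alpha=500.0):
    """|I(T)| for a tube at distance `offset` from the stationary point."""
    phase = alpha * (u + offset) ** 2
    return np.abs(np.sum(np.exp(1j * phase)) * du)

near = tube_amplitude(0.0)   # tube containing the stationary point
far = tube_amplitude(5.0)    # distant tube: phases wind rapidly and cancel

print(near > 10 * far)   # True
```

Increasing `alpha` plays the role of the limit ${\displaystyle S[{\mathcal {C}}]/\hbar \rightarrow \infty }$: the contrast between the two tubes grows ever sharper.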

For a free particle, as you will remember, ${\displaystyle S[{\mathcal {C}}]=-m\,c^{2}\,s[{\mathcal {C}}].}$ From this we gather that the likelihood of finding a freely moving object where according to the laws of classical physics it should not be, decreases as its mass increases. Since for sufficiently massive objects the contributions to the action due to influences on their motion are small compared to ${\displaystyle |-m\,c^{2}\,s[{\mathcal {C}}]|,}$ this is equally true of objects that are not moving freely.

What, then, are the laws of classical physics?

They are what the laws of quantum physics degenerate into in the limit ${\displaystyle \hbar \rightarrow 0.}$ In this limit, as you will gather from the above, the probability of finding that a particle has traveled within a tube (however narrow) containing a geodesic, is 1, and the probability of finding that a particle has traveled within a tube (however wide) not containing a geodesic, is 0. Thus we may state the laws of classical physics (for a single "point mass", to begin with) by saying that it follows a geodesic of the geometry defined by ${\displaystyle dS.}$

This is readily generalized. The propagator for a system with ${\displaystyle n}$ degrees of freedom — such as an ${\displaystyle m}$-particle system with ${\displaystyle n=3m}$ degrees of freedom — is

${\displaystyle \langle {\mathcal {P}}_{f},t_{f}|{\mathcal {P}}_{i},t_{i}\rangle =\int \!{\mathcal {DC}}\,e^{(i/\hbar )S[{\mathcal {C}}]},}$

where ${\displaystyle {\mathcal {P}}_{i}}$ and ${\displaystyle {\mathcal {P}}_{f}}$ are the system's respective configurations at the initial time ${\displaystyle t_{i}}$ and the final time ${\displaystyle t_{f},}$ and the integral sums over all paths in the system's ${\displaystyle n{+}1}$-dimensional configuration spacetime leading from ${\displaystyle ({\mathcal {P}}_{i},t_{i})}$ to ${\displaystyle ({\mathcal {P}}_{f},t_{f}).}$ In this case, too, the corresponding classical system follows a geodesic of the geometry defined by the action differential ${\displaystyle dS,}$ which now depends on ${\displaystyle n}$ spatial coordinates, one time coordinate, and the corresponding ${\displaystyle n{+}1}$ differentials.

The statement that a classical system follows a geodesic of the geometry defined by its action, is often referred to as the principle of least action. A more appropriate name is principle of stationary action.

### Energy and momentum

Observe that if ${\displaystyle dS}$ does not depend on ${\displaystyle t}$ (that is, ${\displaystyle \partial dS/\partial t=0}$ ) then

${\displaystyle E=-{\partial dS \over \partial dt}}$

is constant along geodesics. (We'll discover the reason for the negative sign in a moment.)

Likewise, if ${\displaystyle dS}$ does not depend on ${\displaystyle \mathbf {r} }$ (that is, ${\displaystyle \partial dS/\partial \mathbf {r} =0}$ ) then

${\displaystyle \mathbf {p} ={\partial dS \over \partial d\mathbf {r} }}$

is constant along geodesics.

${\displaystyle E}$ tells us how much the projection ${\displaystyle dt}$ of a segment ${\displaystyle d{\mathcal {C}}}$ of a path ${\displaystyle {\mathcal {C}}}$ onto the time axis contributes to the action of ${\displaystyle {\mathcal {C}}.}$ ${\displaystyle \mathbf {p} }$ tells us how much the projection ${\displaystyle d\mathbf {r} }$ of ${\displaystyle d{\mathcal {C}}}$ onto space contributes to ${\displaystyle S[{\mathcal {C}}].}$ If ${\displaystyle dS}$ has no explicit time dependence, then equal intervals of the time axis make equal contributions to ${\displaystyle S[{\mathcal {C}}],}$ and if ${\displaystyle dS}$ has no explicit space dependence, then equal intervals of any spatial axis make equal contributions to ${\displaystyle S[{\mathcal {C}}].}$ In the former case, equal time intervals are physically equivalent: they represent equal durations. In the latter case, equal space intervals are physically equivalent: they represent equal distances.

If equal intervals of the time coordinate or equal intervals of a space coordinate are not physically equivalent, this is so for one of two reasons. The first is that non-inertial coordinates are used; if inertial coordinates are used, then every freely moving point mass moves by equal intervals of the space coordinates in equal intervals of the time coordinate, which means that equal coordinate intervals are physically equivalent. The second is that the moving object is not moving freely: something, no matter what, influences its motion, no matter how. This is because one way of incorporating effects on the motion of an object into the mathematical formalism of quantum physics is to make inertial coordinate intervals physically inequivalent, by letting ${\displaystyle dS}$ depend on ${\displaystyle t}$ and/or ${\displaystyle \mathbf {r} .}$

Thus for a freely moving classical object, both ${\displaystyle E}$ and ${\displaystyle \mathbf {p} }$ are constant. Since the constancy of ${\displaystyle E}$ follows from the physical equivalence of equal intervals of coordinate time (a.k.a. the "homogeneity" of time), and since (classically) energy is defined as the quantity whose constancy is implied by the homogeneity of time, ${\displaystyle E}$ is the object's energy.

By the same token, since the constancy of ${\displaystyle \mathbf {p} }$ follows from the physical equivalence of equal intervals of any spatial coordinate axis (a.k.a. the "homogeneity" of space), and since (classically) momentum is defined as the quantity whose constancy is implied by the homogeneity of space, ${\displaystyle \mathbf {p} }$ is the object's momentum.

Let us differentiate an earlier result,

${\displaystyle dS(t,\mathbf {r} ,\lambda \,dt,\lambda \,d\mathbf {r} )=\lambda \,dS(t,\mathbf {r} ,dt,d\mathbf {r} ),}$

with respect to ${\displaystyle \lambda .}$ The left-hand side becomes

${\displaystyle {d(dS) \over d\lambda }={\partial dS \over \partial (\lambda dt)}{\partial (\lambda dt) \over \partial \lambda }+{\partial dS \over \partial (\lambda d\mathbf {r} )}\cdot {\partial (\lambda d\mathbf {r} ) \over \partial \lambda }={\partial dS \over \partial (\lambda dt)}dt+{\partial dS \over \partial (\lambda d\mathbf {r} )}\cdot d\mathbf {r} ,}$

while the right-hand side becomes just ${\displaystyle dS.}$ Setting ${\displaystyle \lambda =1}$ and using the above definitions of ${\displaystyle E}$ and ${\displaystyle \mathbf {p} ,}$ we obtain

 ${\displaystyle -E\,dt+\mathbf {p} \cdot d\mathbf {r} =dS.}$

${\displaystyle dS=-m\,c^{2}\,ds}$ is a 4-scalar. Since ${\displaystyle (c\,dt,d\mathbf {r} )}$ are the components of a 4-vector, the left-hand side, ${\displaystyle -E\,dt+\mathbf {p} \cdot d\mathbf {r} ,}$ is a 4-scalar if and only if ${\displaystyle (E/c,\mathbf {p} )}$ are the components of another 4-vector.

(If we had defined ${\displaystyle E}$ without the minus, this 4-vector would have the components ${\displaystyle (-E/c,\mathbf {p} ).}$)

In the rest frame ${\displaystyle {\mathcal {F}}'}$ of a free point mass, ${\displaystyle dt'=ds}$ and ${\displaystyle dS=-m\,c^{2}\,dt'.}$ Using the Lorentz transformations, we find that this equals

${\displaystyle dS=-mc^{2}{dt-v\,dx/c^{2} \over {\sqrt {1-v^{2}/c^{2}}}}=-{mc^{2} \over {\sqrt {1-v^{2}/c^{2}}}}\,dt+{m\mathbf {v} \over {\sqrt {1-v^{2}/c^{2}}}}\cdot d\mathbf {r} ,}$

where ${\displaystyle \mathbf {v} =(v,0,0)}$ is the velocity of the point mass in ${\displaystyle {\mathcal {F}}.}$ Compare with the above framed equation to find that for a free point mass,

${\displaystyle E={mc^{2} \over {\sqrt {1-v^{2}/c^{2}}}}\qquad \mathbf {p} ={m\mathbf {v} \over {\sqrt {1-v^{2}/c^{2}}}}\;.}$
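
These formulas are easy to check numerically. The sketch below (an illustration in Python, not part of the original text; one-dimensional motion and natural units with ${\displaystyle c=1}$ are assumed) computes ${\displaystyle E}$ and ${\displaystyle \mathbf {p} }$ for a given speed and verifies the frame-independent relation ${\displaystyle E^{2}-(pc)^{2}=(mc^{2})^{2},}$ which expresses the fact that ${\displaystyle (E/c,\mathbf {p} )}$ is a 4-vector:

```python
import math

def energy_momentum(m, v, c=1.0):
    """Energy and momentum of a free point mass moving with speed v,
    from the two formulas above (one-dimensional motion)."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * m * c ** 2, gamma * m * v

# A unit mass at v = 0.6 c: E = 1.25, p = 0.75 (natural units, c = 1)
E, p = energy_momentum(1.0, 0.6)
invariant = E ** 2 - p ** 2   # equals (m c^2)^2 = 1 in every frame
```

The invariant comes out the same for any speed below ${\displaystyle c,}$ as it must.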

### Lorentz force law

To incorporate effects on the motion of a particle (regardless of their causes), we must modify the action differential ${\displaystyle dS=-mc^{2}\,dt{\sqrt {1-v^{2}/c^{2}}}}$ that a free particle associates with a path segment ${\displaystyle d{\mathcal {C}}.}$ In doing so we must take care that the modified ${\displaystyle dS}$ (i) remains homogeneous in the differentials and (ii) remains a 4-scalar. The most straightforward way to do this is to add a term that is not just homogeneous but linear in the coordinate differentials:

${\displaystyle (^{*})\quad dS=-mc^{2}\,dt{\sqrt {1-v^{2}/c^{2}}}-qV(t,\mathbf {r} )\,dt+(q/c)\mathbf {A} (t,\mathbf {r} )\cdot d\mathbf {r} .}$

Believe it or not, all classical electromagnetic effects (as against their causes) are accounted for by this expression. ${\displaystyle V(t,\mathbf {r} )}$ is a scalar field (that is, a function of time and space coordinates that is invariant under rotations of the space coordinates), ${\displaystyle \mathbf {A} (t,\mathbf {r} )}$ is a 3-vector field, and ${\displaystyle (V,\mathbf {A} )}$ is a 4-vector field. We call ${\displaystyle V}$ and ${\displaystyle \mathbf {A} }$ the scalar potential and the vector potential, respectively. The particle-specific constant ${\displaystyle q}$ is the electric charge, which determines how strongly a particle of a given species is affected by influences of the electromagnetic kind.

If a point mass is not free, the expressions at the end of the previous section give its kinetic energy ${\displaystyle E_{k}}$ and its kinetic momentum ${\displaystyle \mathbf {p} _{k}.}$ Casting (*) into the form

${\displaystyle dS=-(E_{k}+qV)\,dt+[\mathbf {p} _{k}+(q/c)\mathbf {A} ]\cdot d\mathbf {r} }$

and plugging it into the definitions

${\displaystyle (^{*}{}^{*})\quad E=-{\partial dS \over \partial dt},\qquad \mathbf {p} ={\partial dS \over \partial d\mathbf {r} },}$

we obtain

${\displaystyle E=E_{k}+qV,\qquad \mathbf {p} =\mathbf {p} _{k}+(q/c)\mathbf {A} .}$

${\displaystyle qV}$ and ${\displaystyle (q/c)\mathbf {A} }$ are the particle's potential energy and potential momentum, respectively.

Now we plug (**) into the geodesic equation

${\displaystyle {\partial dS \over \partial \mathbf {r} }=d\,{\partial dS \over \partial d\mathbf {r} }.}$

For the right-hand side we obtain

${\displaystyle d\mathbf {p} _{k}+{q \over c}d\mathbf {A} =d\mathbf {p} _{k}+{q \over c}\left[dt{\partial \mathbf {A} \over \partial t}+\left(d\mathbf {r} \cdot {\partial \over \partial \mathbf {r} }\right)\mathbf {A} \right],}$

while the left-hand side works out at

${\displaystyle -q{\partial V \over \partial \mathbf {r} }dt+{q \over c}{\partial (\mathbf {A} \cdot d\mathbf {r} ) \over \partial \mathbf {r} }=-q{\partial V \over \partial \mathbf {r} }dt+{q \over c}\left[\left(d\mathbf {r} \cdot {\partial \over \partial \mathbf {r} }\right)\mathbf {A} +d\mathbf {r} \times \left({\partial \over \partial \mathbf {r} }\times \mathbf {A} \right)\right].}$

Two terms cancel out, and the final result is

${\displaystyle d\mathbf {p} _{k}=q\underbrace {\left(-{\partial V \over \partial \mathbf {r} }-{1 \over c}{\partial \mathbf {A} \over \partial t}\right)} _{\displaystyle \equiv \mathbf {E} }dt+d\mathbf {r} \times {q \over c}\underbrace {\left({\partial \over \partial \mathbf {r} }\times \mathbf {A} \right)} _{\displaystyle \equiv \mathbf {B} }=q\,\mathbf {E} \,dt+d\mathbf {r} \times {q \over c}\,\mathbf {B} .}$

As a classical object travels along the segment ${\displaystyle d{\mathcal {G}}}$ of a geodesic, its kinetic momentum changes by the sum of two terms, one linear in the temporal component ${\displaystyle dt}$ of ${\displaystyle d{\mathcal {G}}}$ and one linear in the spatial component ${\displaystyle d\mathbf {r} .}$ How much ${\displaystyle dt}$ contributes to the change of ${\displaystyle \mathbf {p} _{k}}$ depends on the electric field ${\displaystyle \mathbf {E} ,}$ and how much ${\displaystyle d\mathbf {r} }$ contributes depends on the magnetic field ${\displaystyle \mathbf {B} .}$ The last equation is usually written in the form

${\displaystyle {d\mathbf {p} _{k} \over dt}=q\,\mathbf {E} +{q \over c}\,\mathbf {v} \times \mathbf {B} ,}$

called the Lorentz force law, and accompanied by the following story: there is a physical entity known as the electromagnetic field, which is present everywhere, and which exerts on a charge ${\displaystyle q}$ an electric force ${\displaystyle q\mathbf {E} }$ and a magnetic force ${\displaystyle (q/c)\,\mathbf {v} \times \mathbf {B} .}$

(Note: This form of the Lorentz force law holds in the Gaussian system of units. In the MKSA system of units the ${\displaystyle c}$ is missing.)
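
A quick numerical illustration (not from the text; Gaussian units with ${\displaystyle q=m=c=1}$ and a crude Euler integrator are assumed): in a purely magnetic field the force is always perpendicular to ${\displaystyle \mathbf {v} ,}$ so it changes the direction of motion but, up to the integrator's small error, not the speed.

```python
import math

def lorentz_step(v, E, B, q, m, c, dt):
    """One Euler step of m dv/dt = q E + (q/c) v x B (Gaussian units)."""
    ax = (q / m) * (E[0] + (v[1] * B[2] - v[2] * B[1]) / c)
    ay = (q / m) * (E[1] + (v[2] * B[0] - v[0] * B[2]) / c)
    az = (q / m) * (E[2] + (v[0] * B[1] - v[1] * B[0]) / c)
    return (v[0] + ax * dt, v[1] + ay * dt, v[2] + az * dt)

# Uniform B along z, no electric field: circular motion with
# angular frequency omega = q B / (m c) = 1.
v = (1.0, 0.0, 0.0)
for _ in range(1000):
    v = lorentz_step(v, (0.0, 0.0, 0.0), (0.0, 0.0, 1.0),
                     q=1.0, m=1.0, c=1.0, dt=1e-3)
speed = math.hypot(v[0], v[1])   # close to 1: the magnetic force does no work
```

After ${\displaystyle t=1}$ the velocity has rotated by about one radian, to roughly ${\displaystyle (\cos 1,-\sin 1,0);}$ the Euler step inflates the speed by only about 0.05%.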

### Whence the classical story?

Imagine a small rectangle in spacetime with corners

${\displaystyle A=(0,0,0,0),\;B=(dt,0,0,0),\;C=(0,dx,0,0),\;D=(dt,dx,0,0).}$

Let's calculate the electromagnetic contribution to the action of the path from ${\displaystyle A}$ to ${\displaystyle D}$ via ${\displaystyle B}$ for a unit charge (${\displaystyle q=1}$) in natural units ( ${\displaystyle c=1}$ ):

${\displaystyle S_{ABD}=-V(dt/2,0,0,0)\,dt+A_{x}(dt,dx/2,0,0)\,dx}$
${\displaystyle \quad =-V(dt/2,0,0,0)\,dt+\left[A_{x}(0,dx/2,0,0)+{\partial A_{x} \over \partial t}dt\right]dx.}$

Next, the contribution to the action of the path from ${\displaystyle A}$ to ${\displaystyle D}$ via ${\displaystyle C}$:

${\displaystyle S_{ACD}=A_{x}(0,dx/2,0,0)\,dx-V(dt/2,dx,0,0)\,dt}$
${\displaystyle =A_{x}(0,dx/2,0,0)\,dx-\left[V(dt/2,0,0,0)+{\partial V \over \partial x}dx\right]dt.}$

Look at the difference:

${\displaystyle \Delta S=S_{ACD}-S_{ABD}=\left(-{\partial V \over \partial x}-{\partial A_{x} \over \partial t}\right)dt\,dx=E_{x}\,dt\,dx.}$

Alternatively, you may think of ${\displaystyle \Delta S}$ as the electromagnetic contribution to the action of the loop ${\displaystyle A\rightarrow C\rightarrow D\rightarrow B\rightarrow A.}$

Let's repeat the calculation for a small rectangle with corners

${\displaystyle A=(0,0,0,0),\;B=(0,0,0,dz),\;C=(0,0,dy,0),\;D=(0,0,dy,dz).}$

${\displaystyle S_{ABD}=A_{z}(0,0,0,dz/2)\,dz+A_{y}(0,0,dy/2,dz)\,dy}$
${\displaystyle =A_{z}(0,0,0,dz/2)\,dz+\left[A_{y}(0,0,dy/2,0)+{\partial A_{y} \over \partial z}dz\right]dy,}$
${\displaystyle S_{ACD}=A_{y}(0,0,dy/2,0)\,dy+A_{z}(0,0,dy,dz/2)\,dz}$
${\displaystyle =A_{y}(0,0,dy/2,0)\,dy+\left[A_{z}(0,0,0,dz/2)+{\partial A_{z} \over \partial y}dy\right]dz,}$
${\displaystyle \Delta S=S_{ACD}-S_{ABD}=\left({\partial A_{z} \over \partial y}-{\partial A_{y} \over \partial z}\right)dy\,dz=B_{x}\,dy\,dz.}$

Thus the electromagnetic contribution to the action of this loop equals the flux of ${\displaystyle \mathbf {B} }$ through the loop.
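
Here is a small numeric illustration (Python; the gauge choice and the numbers are assumptions made for the example, not taken from the text). For the vector potential ${\displaystyle \mathbf {A} =(0,-B_{0}z/2,B_{0}y/2),}$ which yields the uniform field ${\displaystyle \mathbf {B} =(B_{0},0,0),}$ the midpoint-rule action difference between the two edge paths from ${\displaystyle A}$ to ${\displaystyle D}$ reproduces the flux ${\displaystyle B_{0}\,dy\,dz}$:

```python
# Assumed gauge: A = (0, -B0*z/2, B0*y/2) gives a uniform field
# B = (B0, 0, 0), since B_x = dA_z/dy - dA_y/dz = B0/2 + B0/2 = B0.
B0 = 2.0

def A_y(y, z):
    return -B0 * z / 2.0

def A_z(y, z):
    return B0 * y / 2.0

def delta_S(dy, dz):
    """Action difference between the two edge paths from A to D in the
    y-z plane, each edge evaluated by the midpoint rule as in the text
    (unit charge, natural units)."""
    S_zy = A_z(0.0, dz / 2) * dz + A_y(dy / 2, dz) * dy  # first along z, then y
    S_yz = A_y(dy / 2, 0.0) * dy + A_z(dy, dz / 2) * dz  # first along y, then z
    return S_yz - S_zy
```

Because this ${\displaystyle \mathbf {A} }$ is linear in the coordinates, the midpoint rule is exact here; for a general field the agreement holds to leading order in ${\displaystyle dy\,dz.}$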

Remembering (i) Stokes' theorem and (ii) the definition of ${\displaystyle \mathbf {B} }$ in terms of ${\displaystyle \mathbf {A} ,}$ we find that

${\displaystyle \oint _{\partial \Sigma }\mathbf {A} \cdot d\mathbf {r} =\int _{\Sigma }{\hbox{curl}}\,\mathbf {A} \cdot d\mathbf {\Sigma } =\int _{\Sigma }\mathbf {B} \cdot d\mathbf {\Sigma } .}$

In (other) words, the magnetic flux through a loop ${\displaystyle \partial \Sigma }$ (that is, through any surface ${\displaystyle \Sigma }$ bounded by ${\displaystyle \partial \Sigma }$ ) equals the circulation of ${\displaystyle \mathbf {A} }$ around the loop.

The effect of a circulation ${\displaystyle \oint _{\partial \Sigma }\mathbf {A} \cdot d\mathbf {r} }$ around the finite rectangle ${\displaystyle A\rightarrow B\rightarrow D\rightarrow C\rightarrow A}$ is to increase (or decrease) the action associated with the segment ${\displaystyle A\rightarrow B\rightarrow D}$ relative to the action associated with the segment ${\displaystyle A\rightarrow C\rightarrow D.}$ If the actions of the two segments are equal, then we can expect the path of least action from ${\displaystyle A}$ to ${\displaystyle D}$ to be a straight line. If one segment has a greater action than the other, then we can expect the path of least action from ${\displaystyle A}$ to ${\displaystyle D}$ to curve away from the segment with the larger action.

Compare this with the classical story, which explains the curvature of the path of a charged particle in a magnetic field by invoking a force that acts at right angles to both the magnetic field and the particle's direction of motion. The quantum-mechanical treatment of the same effect offers no such explanation. Quantum mechanics invokes no mechanism of any kind. It simply tells us that for a sufficiently massive charge traveling from ${\displaystyle A}$ to ${\displaystyle D,}$ the probability of finding that it has done so within any bundle of paths not containing the action-geodesic connecting ${\displaystyle A}$ with ${\displaystyle D,}$ is virtually 0.

Much the same goes for the classical story according to which the curvature of the path of a charged particle in a spacetime plane is due to a force that acts in the direction of the electric field. (Observe that curvature in a spacetime plane is equivalent to acceleration or deceleration. In particular, curvature in a spacetime plane containing the ${\displaystyle x}$ axis is equivalent to acceleration in a direction parallel to the ${\displaystyle x}$ axis.) In this case the corresponding circulation is that of the 4-vector potential ${\displaystyle (cV,\mathbf {A} )}$ around a spacetime loop.

## Schrödinger at last

The Schrödinger equation is non-relativistic. We obtain the non-relativistic version of the electromagnetic action differential,

${\displaystyle dS=-mc^{2}\,dt{\sqrt {1-v^{2}/c^{2}}}-qV(t,\mathbf {r} )\,dt+(q/c)\mathbf {A} (t,\mathbf {r} )\cdot d\mathbf {r} ,}$

by expanding the root and ignoring all but the first two terms:

${\displaystyle {\sqrt {1-v^{2}/c^{2}}}=1-{1 \over 2}{v^{2} \over c^{2}}-{1 \over 8}{v^{4} \over c^{4}}-\cdots \approx 1-{1 \over 2}{v^{2} \over c^{2}}.}$

This is obviously justified if ${\displaystyle v\ll c,}$ which defines the non-relativistic regime.
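
The quality of this truncation is easy to quantify: the first neglected term is ${\displaystyle (1/8)(v/c)^{4},}$ so at one percent of the speed of light the error is of order ${\displaystyle 10^{-9}.}$ A quick Python check (illustrative only):

```python
import math

def gamma_inv(v, c=1.0):
    """The exact square root sqrt(1 - v^2/c^2)."""
    return math.sqrt(1.0 - (v / c) ** 2)

def gamma_inv_approx(v, c=1.0):
    """The two-term truncation 1 - v^2/(2 c^2) used above."""
    return 1.0 - 0.5 * (v / c) ** 2

# At v = 0.01 c the truncation error is roughly (1/8)(v/c)^4 ~ 1.25e-9
err = abs(gamma_inv(0.01) - gamma_inv_approx(0.01))
```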

Writing the potential part of ${\displaystyle dS}$ as ${\displaystyle q\,[-V+\mathbf {A} (t,\mathbf {r} )\cdot (\mathbf {v} /c)]\,dt}$ makes it clear that in most non-relativistic situations the effects represented by the vector potential ${\displaystyle \mathbf {A} }$ are small compared to those represented by the scalar potential ${\displaystyle V.}$ If we ignore them (or assume that ${\displaystyle \mathbf {A} }$ vanishes), and if we include the charge ${\displaystyle q}$ in the definition of ${\displaystyle V}$ (or assume that ${\displaystyle q=1}$), we obtain

${\displaystyle S[{\mathcal {C}}]=-mc^{2}(t_{B}-t_{A})+\int _{\mathcal {C}}dt\left[{\textstyle {m \over 2}}v^{2}-V(t,\mathbf {r} )\right]}$

for the action associated with a spacetime path ${\displaystyle {\mathcal {C}}.}$

Because the first term is the same for all paths from ${\displaystyle A}$ to ${\displaystyle B,}$ it has no effect on the differences between the phases of the amplitudes associated with different paths. By dropping it we change neither the classical phenomena (inasmuch as the extremal path remains the same) nor the quantum phenomena (inasmuch as interference effects only depend on those differences). Thus

${\displaystyle \langle B|A\rangle =\int {\mathcal {DC}}e^{(i/\hbar )\int _{\mathcal {C}}dt[(m/2)v^{2}-V]}.}$

We now introduce the so-called wave function ${\displaystyle \psi (t,\mathbf {r} )}$ as the amplitude of finding our particle at ${\displaystyle \mathbf {r} }$ if the appropriate measurement is made at time ${\displaystyle t.}$ ${\displaystyle \langle t,\mathbf {r} |t',\mathbf {r} '\rangle \,\psi (t',\mathbf {r} '),}$ accordingly, is the amplitude of finding the particle first at ${\displaystyle \mathbf {r} '}$ (at time ${\displaystyle t'}$) and then at ${\displaystyle \mathbf {r} }$ (at time ${\displaystyle t}$). Integrating over ${\displaystyle \mathbf {r} ',}$ we obtain the amplitude of finding the particle at ${\displaystyle \mathbf {r} }$ (at time ${\displaystyle t}$), provided that Rule B applies. The wave function thus satisfies the equation

${\displaystyle \psi (t,\mathbf {r} )=\int \!d^{3}r'\,\langle t,\mathbf {r} |t',\mathbf {r} '\rangle \,\psi (t',\mathbf {r} ').}$

We again simplify our task by pretending that space is one-dimensional. We further assume that ${\displaystyle t}$ and ${\displaystyle t'}$ differ by an infinitesimal interval ${\displaystyle \epsilon .}$ Since ${\displaystyle \epsilon }$ is infinitesimal, there is only one path leading from ${\displaystyle x'}$ to ${\displaystyle x.}$ We can therefore forget about the path integral except for a normalization factor ${\displaystyle {\mathcal {A}}}$ implicit in the integration measure ${\displaystyle {\mathcal {DC}},}$ and make the following substitutions:

${\displaystyle dt=\epsilon ,\quad v={\frac {x-x'}{\epsilon }},\quad V=V\left(t{+}{\frac {\epsilon }{2}},{\frac {x{+}x'}{2}}\right).}$

This gives us

${\displaystyle \psi (t{+}\epsilon ,x)={\mathcal {A}}\int \!dx'\,e^{im(x{-}x')^{2}/2\hbar \epsilon }\,e^{-(i\epsilon /\hbar )V(t{+}\epsilon /2,(x{+}x')/2)}\,\psi (t,x').}$

We obtain a further simplification if we introduce ${\displaystyle \eta =x'-x}$ and integrate over ${\displaystyle \eta }$ instead of ${\displaystyle x'.}$ (The integration "boundaries" ${\displaystyle -\infty }$ and ${\displaystyle +\infty }$ are the same for both ${\displaystyle x'}$ and ${\displaystyle \eta .}$) We now have that

${\displaystyle \psi (t+\epsilon ,x)={\mathcal {A}}\int \!d\eta \,e^{im\eta ^{2}/2\hbar \epsilon }\,e^{-(i\epsilon /\hbar )V(t{+}\epsilon /2,x{+}\eta /2)}\,\psi (t,x{+}\eta ).}$

Since we are interested in the limit ${\displaystyle \epsilon \rightarrow 0,}$ we expand all terms to first order in ${\displaystyle \epsilon .}$ To which power in ${\displaystyle \eta }$ should we expand? As ${\displaystyle \eta }$ increases, the phase ${\displaystyle m\eta ^{2}/2\hbar \epsilon }$ increases at an infinite rate (in the limit ${\displaystyle \epsilon \rightarrow 0}$) unless ${\displaystyle \eta ^{2}}$ is of the same order as ${\displaystyle \epsilon .}$ Contributions from larger values of ${\displaystyle \eta }$ oscillate so rapidly that they cancel out; expanding to second order in ${\displaystyle \eta }$ therefore amounts to expanding to first order in ${\displaystyle \epsilon .}$ Thus the left-hand side expands to

${\displaystyle \psi (t+\epsilon ,x)\approx \psi (t,x)+{\partial \psi \over \partial t}\epsilon ,}$

while ${\displaystyle e^{-(i\epsilon /\hbar )V(t{+}\epsilon /2,x{+}\eta /2)}\,\psi (t,x{+}\eta )}$ expands to

${\displaystyle \left[1-{i\epsilon \over \hbar }V(t,x)\right]\left[\psi (t,x)+{\partial \psi \over \partial x}\eta +{\frac {1}{2}}{\partial ^{2}\psi \over \partial x^{2}}\eta ^{2}\right]=\left[1-{i\epsilon \over \hbar }V(t,x)\right]\!\psi (t,x)+{\partial \psi \over \partial x}\eta +{\partial ^{2}\psi \over \partial x^{2}}{\eta ^{2} \over 2}.}$

The following integrals need to be evaluated:

${\displaystyle I_{1}=\int \!d\eta \,e^{im\eta ^{2}/2\hbar \epsilon },\quad I_{2}=\int \!d\eta \,e^{im\eta ^{2}/2\hbar \epsilon }\eta ,\quad I_{3}=\int \!d\eta \,e^{im\eta ^{2}/2\hbar \epsilon }\eta ^{2}.}$

The results are

${\displaystyle I_{1}={\sqrt {2\pi i\hbar \epsilon /m}},\quad I_{2}=0,\quad I_{3}={\sqrt {2\pi \hbar ^{3}\epsilon ^{3}/im^{3}}}.}$
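
These are Gaussian ("Fresnel") integrals. They can be checked numerically by giving the exponent a small negative real part ${\displaystyle \delta ,}$ which makes the integrals absolutely convergent; letting ${\displaystyle \delta \rightarrow 0}$ recovers the values quoted above. The sketch below (illustrative only; the values of ${\displaystyle a=m/2\hbar \epsilon }$ and ${\displaystyle \delta }$ are arbitrary) compares a brute-force quadrature with the closed forms:

```python
import cmath
import math

a, delta = 3.0, 0.05          # a stands for m/(2*hbar*eps); delta is a damping
b = delta - 1j * a            # the integrand becomes exp(-b * eta**2)

# Trapezoidal quadrature of I1, I2, I3 on [-L, L]
L, N = 30.0, 200000
h = 2 * L / N
I1 = I2 = I3 = 0j
for k in range(N + 1):
    eta = -L + k * h
    w = h if 0 < k < N else h / 2
    f = cmath.exp(-b * eta * eta) * w
    I1 += f
    I2 += f * eta
    I3 += f * eta * eta

# Closed forms for Re(b) > 0: I1 = sqrt(pi/b), I2 = 0, I3 = sqrt(pi)/(2 b^(3/2))
I1_exact = cmath.sqrt(math.pi / b)
I3_exact = cmath.sqrt(math.pi) / (2 * b ** 1.5)
```

With ${\displaystyle b=-ia}$ the closed forms become exactly the ${\displaystyle I_{1},I_{2},I_{3}}$ quoted above.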

Putting Humpty Dumpty back together again yields

${\displaystyle \psi (t,x)+{\partial \psi \over \partial t}\epsilon ={\mathcal {A}}{\sqrt {2\pi i\hbar \epsilon \over m}}\left(1-{i\epsilon \over \hbar }V(t,x)\right)\psi (t,x)+{{\mathcal {A}} \over 2}{\sqrt {2\pi \hbar ^{3}\epsilon ^{3} \over im^{3}}}{\partial ^{2}\psi \over \partial x^{2}}.}$

The factor of ${\displaystyle \psi (t,x)}$ must be the same on both sides, so ${\displaystyle {\mathcal {A}}={\sqrt {m/2\pi i\hbar \epsilon }},}$ which reduces Humpty Dumpty to

${\displaystyle {\partial \psi \over \partial t}\epsilon =-{i\epsilon \over \hbar }V\psi +{i\hbar \epsilon \over 2m}{\partial ^{2}\psi \over \partial x^{2}}.}$

Multiplying by ${\displaystyle i\hbar /\epsilon }$ and taking the limit ${\displaystyle \epsilon \rightarrow 0}$ (which is trivial since ${\displaystyle \epsilon }$ has dropped out), we arrive at the Schrödinger equation for a particle with one degree of freedom subject to a potential ${\displaystyle V(t,x)}$:

${\displaystyle i\hbar {\partial \psi \over \partial t}=-{\hbar ^{2} \over 2m}{\partial ^{2}\psi \over \partial x^{2}}+V\psi .}$

Trumpets please! The transition to three dimensions is straightforward:

 ${\displaystyle i\hbar {\partial \psi \over \partial t}=-{\hbar ^{2} \over 2m}\left({\partial ^{2}\psi \over \partial x^{2}}+{\partial ^{2}\psi \over \partial y^{2}}+{\partial ^{2}\psi \over \partial z^{2}}\right)+V\psi .}$
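
The equation is also easy to integrate numerically. The sketch below (a minimal illustration, none of which comes from the text; it assumes ${\displaystyle \hbar =m=1,}$ ${\displaystyle V=0,}$ and a crude finite-difference grid) evolves a Gaussian wave function with the Crank–Nicolson scheme, which is unitary, and checks that the norm of ${\displaystyle \psi }$ stays equal to 1 while the packet spreads:

```python
import cmath
import math

hbar = m = 1.0
N, Lbox, dt = 400, 40.0, 0.01
dx = Lbox / N
xs = [-Lbox / 2 + i * dx for i in range(N)]

# Initial Gaussian, normalized so that sum |psi|^2 dx = 1
psi = [cmath.exp(-x * x / 2) for x in xs]
n0 = math.sqrt(sum(abs(p) ** 2 for p in psi) * dx)
psi = [p / n0 for p in psi]

alpha = 1j * hbar * dt / (4 * m * dx * dx)

def solve_tridiagonal(sub, main, sup, rhs):
    """Thomas algorithm for a tridiagonal system with constant diagonals."""
    n = len(rhs)
    cp, dp = [0j] * n, [0j] * n
    cp[0], dp[0] = sup / main, rhs[0] / main
    for i in range(1, n):
        denom = main - sub * cp[i - 1]
        cp[i] = sup / denom
        dp[i] = (rhs[i] - sub * dp[i - 1]) / denom
    x = [0j] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

# Crank-Nicolson: (1 + i dt H/2hbar) psi_new = (1 - i dt H/2hbar) psi_old,
# with H = -(hbar^2/2m) d^2/dx^2 discretized on the grid.
for _ in range(100):                      # evolve to t = 1
    rhs = [psi[i] + alpha * (psi[i - 1] - 2 * psi[i] + psi[(i + 1) % N])
           for i in range(N)]
    psi = solve_tridiagonal(-alpha, 1 + 2 * alpha, -alpha, rhs)

norm_t = math.sqrt(sum(abs(p) ** 2 for p in psi) * dx)       # stays ~1
spread = math.sqrt(sum(x * x * abs(p) ** 2 for x, p in zip(xs, psi)) * dx)
```

The norm stays at 1 to high accuracy, and the r.m.s. width grows from ${\displaystyle 1/{\sqrt {2}}}$ at ${\displaystyle t=0}$ to about 1 at ${\displaystyle t=1,}$ as one can verify analytically for a freely spreading Gaussian.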

# The Schrödinger equation: implications and applications

In this chapter we take a look at some of the implications of the Schrödinger equation

${\displaystyle i\hbar \,{\frac {\partial \psi }{\partial t}}={\frac {1}{2m}}\left({\frac {\hbar }{i}}{\frac {\partial }{\partial \mathbf {r} }}-\mathbf {A} \right)^{2}\psi +V\psi .}$

## How fuzzy positions get fuzzier

We will calculate the rate at which the fuzziness of a position probability distribution increases, in consequence of the fuzziness of the corresponding momentum, when there is no counterbalancing attraction (like that between the nucleus and the electron in atomic hydrogen).

Because it is easy to handle, we choose a Gaussian function

${\displaystyle \psi (0,x)=Ne^{-x^{2}/2\sigma ^{2}},}$

which has a bell-shaped graph. It defines a position probability distribution

${\displaystyle |\psi (0,x)|^{2}=N^{2}e^{-x^{2}/\sigma ^{2}}.}$

If we normalize this distribution so that ${\displaystyle \int dx\,|\psi (0,x)|^{2}=1,}$ then ${\displaystyle N^{2}=1/\sigma {\sqrt {\pi }},}$ and

${\displaystyle |\psi (0,x)|^{2}=e^{-x^{2}/\sigma ^{2}}/\sigma {\sqrt {\pi }}.}$

We also have that

• ${\displaystyle \Delta x(0)=\sigma /{\sqrt {2}},}$
• the Fourier transform of ${\displaystyle \psi (0,x)}$ is ${\displaystyle {\overline {\psi }}(0,k)={\sqrt {\sigma /{\sqrt {\pi }}}}e^{-\sigma ^{2}k^{2}/2},}$
• this defines the momentum probability distribution ${\displaystyle |{\overline {\psi }}(0,k)|^{2}=\sigma e^{-\sigma ^{2}k^{2}}/{\sqrt {\pi }},}$
• and ${\displaystyle \Delta k(0)=1/\sigma {\sqrt {2}}.}$

The fuzziness of the position and of the momentum of a particle associated with ${\displaystyle \psi (0,x)}$ is therefore the minimum allowed by the "uncertainty" relation: ${\displaystyle \Delta x(0)\,\Delta k(0)=1/2.}$

Now recall that

${\displaystyle {\overline {\psi }}(t,k)={\overline {\psi }}(0,k)\,e^{-i\omega t},}$

where ${\displaystyle \omega =\hbar k^{2}/2m.}$ This has the Fourier transform

${\displaystyle \psi (t,x)={\sqrt {\sigma \over {\sqrt {\pi }}}}{1 \over {\sqrt {\sigma ^{2}+i\,(\hbar /m)\,t}}}\,e^{-x^{2}/2[\sigma ^{2}+i\,(\hbar /m)\,t]},}$

and this defines the position probability distribution

${\displaystyle |\psi (t,x)|^{2}={1 \over {\sqrt {\pi }}{\sqrt {\sigma ^{2}+(\hbar ^{2}/m^{2}\sigma ^{2})\,t^{2}}}}\,e^{-x^{2}/[\sigma ^{2}+(\hbar ^{2}/m^{2}\sigma ^{2})\,t^{2}]}.}$

Comparison with ${\displaystyle |\psi (0,x)|^{2}}$ reveals that ${\displaystyle \sigma (t)={\sqrt {\sigma ^{2}+(\hbar ^{2}/m^{2}\sigma ^{2})\,t^{2}}}.}$ Therefore,

${\displaystyle \Delta x(t)={\sigma (t) \over {\sqrt {2}}}={\sqrt {{\sigma ^{2} \over 2}+{\hbar ^{2}t^{2} \over 2m^{2}\sigma ^{2}}}}={\sqrt {[\Delta x(0)]^{2}+{\hbar ^{2}t^{2} \over 4m^{2}[\Delta x(0)]^{2}}}}.}$

The graphs below illustrate how rapidly the fuzziness of a particle with the mass of an electron grows, compared to that of an object with the mass of a ${\displaystyle C_{60}}$ molecule or of a peanut. Here we see one reason, though by no means the only one, why "once sharp, always sharp" is, for all intents and purposes, true of the positions of macroscopic objects.

• An electron with ${\displaystyle \Delta x(0)=1}$ nanometer: in a second, ${\displaystyle \Delta x(t)}$ grows to nearly 60 km.
• An electron with ${\displaystyle \Delta x(0)=1}$ centimeter: ${\displaystyle \Delta x(t)}$ grows only 16% in a second.
• A ${\displaystyle C_{60}}$ molecule with ${\displaystyle \Delta x(0)=1}$ nanometer: in a second, ${\displaystyle \Delta x(t)}$ grows to 4.4 centimeters.
• A peanut (2.8 g) with ${\displaystyle \Delta x(0)=1}$ nanometer: ${\displaystyle \Delta x(t)}$ takes the present age of the universe to grow to 7.5 micrometers.
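
These figures follow directly from the formula for ${\displaystyle \Delta x(t)}$ derived above. A Python sketch reproducing them (illustrative; rough values are assumed for the physical constants and for the age of the universe):

```python
import math

HBAR = 1.0545718e-34        # J s
M_ELECTRON = 9.10938e-31    # kg
M_C60 = 720 * 1.66054e-27   # kg (60 carbon-12 atoms)
M_PEANUT = 2.8e-3           # kg
AGE_OF_UNIVERSE = 4.3e17    # s, roughly

def delta_x(t, dx0, m):
    """Delta x(t) from the formula above (all quantities in SI units)."""
    return math.sqrt(dx0 ** 2 + (HBAR * t) ** 2 / (4 * m ** 2 * dx0 ** 2))

# electron, 1 nm:  ~58 km after one second
# electron, 1 cm:  ~16% growth after one second
# C60, 1 nm:       ~4.4 cm after one second
# peanut, 1 nm:    ~8 micrometers after AGE_OF_UNIVERSE (the 7.5 um quoted
#                  above corresponds to a slightly smaller age)
```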

## Time-independent Schrödinger equation

If the potential ${\displaystyle V}$ does not depend on time, then the Schrödinger equation has solutions that are products of a time-independent function ${\displaystyle \psi (\mathbf {r} )}$ and a time-dependent phase factor ${\displaystyle e^{-(i/\hbar )\,E\,t}}$:

${\displaystyle \psi (t,\mathbf {r} )=\psi (\mathbf {r} )\,e^{-(i/\hbar )\,E\,t}.}$

Because the probability density ${\displaystyle |\psi (t,\mathbf {r} )|^{2}}$ is independent of time, these solutions are called stationary.

Plug ${\displaystyle \psi (\mathbf {r} )\,e^{-(i/\hbar )\,E\,t}}$ into

${\displaystyle i\hbar {\frac {\partial \psi }{\partial t}}=-{\frac {\hbar ^{2}}{2m}}{\frac {\partial }{\partial \mathbf {r} }}\cdot {\frac {\partial }{\partial \mathbf {r} }}\psi +V\psi }$

to find that ${\displaystyle \psi (\mathbf {r} )}$ satisfies the time-independent Schrödinger equation

${\displaystyle E\psi (\mathbf {r} )=-{\hbar ^{2} \over 2m}\left({\frac {\partial ^{2}}{\partial x^{2}}}+{\frac {\partial ^{2}}{\partial y^{2}}}+{\frac {\partial ^{2}}{\partial z^{2}}}\right)\psi (\mathbf {r} )+V(\mathbf {r} )\,\psi (\mathbf {r} ).}$
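
For most potentials this equation must be solved numerically. The sketch below (an illustration under assumed conventions, not from the text: ${\displaystyle \hbar =m=1}$ and a one-dimensional box of width ${\displaystyle \pi }$ with impenetrable walls, for which the exact levels are ${\displaystyle E_{n}=n^{2}/2}$) finds energies by "shooting": integrate ${\displaystyle \psi }$ from one wall and bisect for the energies at which ${\displaystyle \psi }$ vanishes at the other wall. It also foreshadows the next section: acceptable solutions exist only for discrete values of ${\displaystyle E.}$

```python
import math

def psi_end(E, N=2000):
    """Integrate psi'' = 2(V - E) psi (hbar = m = 1, V = 0) across a box
    of width pi, starting from psi(0) = 0, and return psi(pi)."""
    h = math.pi / N
    psi_prev, psi = 0.0, h        # psi(0) = 0, arbitrary small initial slope
    for _ in range(N - 1):
        psi_prev, psi = psi, 2 * psi - psi_prev - 2 * E * h * h * psi
    return psi

def find_level(E_lo, E_hi):
    """Bisect for an energy at which psi(pi) = 0 (sign change assumed)."""
    f_lo = psi_end(E_lo)
    for _ in range(60):
        E_mid = 0.5 * (E_lo + E_hi)
        f_mid = psi_end(E_mid)
        if f_lo * f_mid > 0:
            E_lo, f_lo = E_mid, f_mid
        else:
            E_hi = E_mid
    return 0.5 * (E_lo + E_hi)

E1 = find_level(0.3, 0.7)   # exact: 1/2
E2 = find_level(1.5, 2.5)   # exact: 2
```

Between the eigenvalues, every trial energy produces a ${\displaystyle \psi }$ that misses the boundary condition at the far wall; only at discrete energies do both conditions hold at once.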

## Why energy is quantized

Limiting ourselves again to one spatial dimension, we write the time-independent Schrödinger equation in this form:

${\displaystyle {d^{2}\psi (x) \over dx^{2}}=A(x)\,\psi (x),\qquad A(x)={2m \over \hbar ^{2}}{\Big [}V(x)-E{\Big ]}.}$