# Classical Mechanics/Lagrange Theory


This section contains several theoretical developments of the Lagrangian formalism that are not directly necessary for solving problems. However, these considerations help understand the theory more deeply and answer certain important questions.

## Why does the extremum of a functional determine motion?

In the Lagrangian formulation of mechanics, the trajectory ${\displaystyle {\vec {q}}(t)}$ is determined from the condition that the action functional ${\displaystyle S[{\vec {q}}(t)]}$ should have an extremum. (It is not always the case that the trajectory is the minimum of the action; in some cases it might be merely an extremum, i.e. a point where the functional derivative ${\displaystyle \delta S/\delta {\vec {q}}(t)}$ vanishes.) This condition is known as the action principle. By now, you should be familiar with the mathematical procedures used to derive the equations of motion from the action principle.

So, at this point, you should be well used to the fact that the correct equations of motion for each mechanical system indeed follow from the action principle, if the Lagrangian is chosen appropriately. However, it might still feel like a mystery to you that Newton's laws are equivalent to the condition for the extremum of some functional. You might be asking yourself: why is this possible at all?

Here is one explanation that may help. Let us consider a simple mechanical system: a point mass ${\displaystyle m}$ moving in one dimension, with coordinate ${\displaystyle x(t)}$, in a potential ${\displaystyle U(x)}$. (The same considerations are easily generalized to more than one dimension and more than one coordinate.) Suppose that ${\displaystyle x_{0}(t)}$ is the correct trajectory according to Newton's law,

${\displaystyle m{\ddot {x}}_{0}(t)=-\left.{\frac {dU}{dx}}\right|_{x=x_{0}(t)}.}$

How can we use a functional ${\displaystyle S[x]}$ to express the condition that the trajectory ${\displaystyle x(t)}$ is the correct one? One way is to demand that the deviation of ${\displaystyle x(t)}$ from ${\displaystyle x_{0}(t)}$ is everywhere zero. This can be expressed using the functional

${\displaystyle S_{1}[x]=\int _{t_{1}}^{t_{2}}[x(t)-x_{0}(t)]^{2}dt.}$

It is clear that the functional ${\displaystyle S_{1}[x]}$ attains its minimum value, which is 0, if and only if ${\displaystyle x(t)=x_{0}(t)}$ for all ${\displaystyle t}$. This is an example of how to use a functional to express some condition on functions: the functional ${\displaystyle S_{1}[x]}$ measures the deviation of ${\displaystyle x(t)}$ from ${\displaystyle x_{0}(t)}$ all along the way. The smallest possible deviation is no deviation at all; thus, the minimum of the functional ${\displaystyle S_{1}[x]}$ is at the trajectory ${\displaystyle x(t)}$ that does not deviate at all from ${\displaystyle x_{0}(t)}$.

Another similar way to specify the trajectory is to use the functional

${\displaystyle S_{2}[x]=\int _{t_{1}}^{t_{2}}[{\dot {x}}(t)-{\dot {x}}_{0}(t)]^{2}dt.}$

This functional, together with the boundary conditions ${\displaystyle x(t_{1})=x_{0}(t_{1}),x(t_{2})=x_{0}(t_{2})}$, has the minimum value if and only if ${\displaystyle x(t)=x_{0}(t)}$ for all ${\displaystyle t}$.
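The role of the boundary conditions can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes the reference trajectory ${\displaystyle x_{0}(t)=\sin t}$ on the interval ${\displaystyle [0,1]}$ and three hand-picked trial trajectories, and it evaluates ${\displaystyle S_{1}}$ and ${\displaystyle S_{2}}$ with Simpson's rule. All names in the code are illustrative.

```python
import math

T1, T2 = 0.0, 1.0  # assumed time interval

def simpson(f, a, b, n=1000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Assumed reference trajectory x0(t) = sin t and its velocity.
def x0(t):
    return math.sin(t)

def v0(t):
    return math.cos(t)

# Each trial trajectory is given as a pair (x(t), x_dot(t)).
candidates = {
    "x0 itself": (x0, v0),
    # Shifted by a constant: same velocity, but the end points are wrong.
    "constant shift": (lambda t: x0(t) + 0.3, v0),
    # A bump that vanishes at t = 0 and t = 1, so the end points are kept.
    "bump, fixed ends": (
        lambda t: x0(t) + 0.3 * math.sin(math.pi * t),
        lambda t: v0(t) + 0.3 * math.pi * math.cos(math.pi * t),
    ),
}

def S1(x):
    return simpson(lambda t: (x(t) - x0(t)) ** 2, T1, T2)

def S2(v):
    return simpson(lambda t: (v(t) - v0(t)) ** 2, T1, T2)

results = {name: (S1(x), S2(v)) for name, (x, v) in candidates.items()}
for name, (s1, s2) in results.items():
    print(f"{name:18s} S1 = {s1:.4f}   S2 = {s2:.4f}")
```

The constant shift gives ${\displaystyle S_{2}=0}$ although the trajectory differs from ${\displaystyle x_{0}(t)}$; it is excluded only by the boundary conditions. The bump respects the boundary conditions and gives a positive value of ${\displaystyle S_{2}}$.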

Admittedly, the functionals ${\displaystyle S_{1}[x]}$ and ${\displaystyle S_{2}[x]}$ do not help us to formulate the laws of mechanics, because they already contain the correct trajectory ${\displaystyle x_{0}(t)}$ explicitly. We shall now construct another functional, ${\displaystyle S_{3}[x]}$, starting from ${\displaystyle S_{2}[x]}$ and trying to eliminate the explicit dependence on ${\displaystyle x_{0}(t)}$.

Let us rewrite ${\displaystyle S_{2}[x]}$ as

${\displaystyle S_{2}[x]=\int _{t_{1}}^{t_{2}}[{\dot {x}}^{2}-2{\dot {x}}{\dot {x}}_{0}+{\dot {x}}_{0}^{2}]dt.}$

The third term, ${\displaystyle {\dot {x}}_{0}^{2}}$, is a fixed function and does not vary when we vary ${\displaystyle x(t)}$. Therefore we may omit that term from ${\displaystyle S_{2}}$. Furthermore, we would like to have ${\displaystyle {\ddot {x}}_{0}}$ rather than ${\displaystyle {\dot {x}}_{0}}$, since we could then use Newton's law for the correct trajectory. So let us integrate the second term by parts:

${\displaystyle -2\int _{t_{1}}^{t_{2}}{\dot {x}}{\dot {x}}_{0}dt=-2\left.x{\dot {x}}_{0}\right|_{t_{1}}^{t_{2}}+\int _{t_{1}}^{t_{2}}2x{\ddot {x}}_{0}dt.}$

The boundary term ${\displaystyle \left.x{\dot {x}}_{0}\right|_{t_{1}}^{t_{2}}}$ does not vary with ${\displaystyle x(t)}$ since the boundary values of ${\displaystyle x(t)}$ are fixed. Therefore we may omit that term. Finally, we use Newton's law to replace ${\displaystyle {\ddot {x}}_{0}}$ by ${\displaystyle -m^{-1}U'(x_{0})}$:

${\displaystyle \int _{t_{1}}^{t_{2}}2x{\ddot {x}}_{0}dt=-\int _{t_{1}}^{t_{2}}2m^{-1}xU'(x_{0})dt.}$

If we now assume that the trajectory ${\displaystyle x(t)}$ deviates very little from the correct trajectory ${\displaystyle x_{0}(t)}$, then we may use the Taylor expansion ${\displaystyle U(x)=U(x_{0})+(x-x_{0})U'(x_{0})+O[(x-x_{0})^{2}]}$ to write

${\displaystyle xU'(x_{0})=(x-x_{0})U'(x_{0})+x_{0}U'(x_{0})=U(x)-U(x_{0})+O[(x-x_{0})^{2}]+x_{0}U'(x_{0}).}$

The term quadratic in ${\displaystyle (x-x_{0})}$ can be omitted under the above assumption. The terms ${\displaystyle U(x_{0})}$ and ${\displaystyle x_{0}U'(x_{0})}$ can be omitted since they are independent of ${\displaystyle x(t)}$. Thus we find that the functional ${\displaystyle S_{2}}$ is equivalent, up to inessential terms that do not vary with ${\displaystyle x(t)}$, to the following functional:

${\displaystyle S_{3}[x]=\int _{t_{1}}^{t_{2}}[{\dot {x}}^{2}-2m^{-1}U(x)]dt.}$

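It is clear that ${\displaystyle S_{3}}$ is equivalent to the usual action up to the constant factor ${\displaystyle m/2}$: multiplying by this factor gives ${\displaystyle {\frac {m}{2}}S_{3}[x]=\int _{t_{1}}^{t_{2}}\left[{\frac {m{\dot {x}}^{2}}{2}}-U(x)\right]dt}$, whose integrand is the usual Lagrangian.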

In this way, we obtained a functional ${\displaystyle S_{3}[x]}$ which has a minimum when ${\displaystyle x(t)}$ is very close to ${\displaystyle x_{0}(t)}$; i.e. it is a local minimum. The new functional does not depend explicitly on ${\displaystyle x_{0}(t)}$, just as we wanted. The price to pay is that this functional works only for small deviations from the correct trajectory. Indeed, the functional ${\displaystyle S_{3}}$ may have other minima or maxima which the original functional ${\displaystyle S_{2}}$ does not have. The only real justification for the correctness of ${\displaystyle S_{3}}$ is that the equations of motion coincide with Newton's law.
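The statement that ${\displaystyle S_{3}}$ is stationary at the Newtonian trajectory can be tested numerically. The sketch below is an illustration under stated assumptions, not part of the derivation: it takes a harmonic potential ${\displaystyle U(x)=kx^{2}/2}$ with ${\displaystyle m=k=1}$, the exact solution ${\displaystyle x_{0}(t)=\sin t}$ on ${\displaystyle [0,1]}$, and the deviation ${\displaystyle \varepsilon \sin(\pi t)}$, which vanishes at both end points.

```python
import math

M, K = 1.0, 1.0    # assumed mass and spring constant, U(x) = K x^2 / 2
T1, T2 = 0.0, 1.0  # assumed time interval

def simpson(f, a, b, n=1000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def action_S3(eps):
    """S3 for the trial path x0(t) + eps * sin(pi t), with x0(t) = sin t."""
    def integrand(t):
        x = math.sin(t) + eps * math.sin(math.pi * t)
        v = math.cos(t) + eps * math.pi * math.cos(math.pi * t)
        return v ** 2 - (2.0 / M) * (0.5 * K * x ** 2)
    return simpson(integrand, T1, T2)

S_ref = action_S3(0.0)
for eps in (0.1, -0.1, 0.01, -0.01):
    change = action_S3(eps) - S_ref
    print(f"eps = {eps:+.2f}   S3 - S3[x0] = {change:.3e}")
```

The change in ${\displaystyle S_{3}}$ is the same for ${\displaystyle +\varepsilon }$ and ${\displaystyle -\varepsilon }$ and shrinks by a factor of 100 when ${\displaystyle \varepsilon }$ shrinks by a factor of 10. There is thus no term linear in ${\displaystyle \varepsilon }$, which is the stationarity condition. In this particular example the change is positive, so ${\displaystyle x_{0}(t)}$ is a local minimum.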

## Why can we use arbitrary coordinates to write the Lagrangian?

In simple cases, the Lagrangian is equal to the difference of the kinetic and the potential energy terms. However, one needs to select some coordinates to describe these terms. It is completely unimportant which variables are chosen as coordinates; these variables could be lengths, angles, or any functions of lengths and angles (but not velocities!). In other words, one can use any coordinate system, or even just parts of some coordinate systems, as long as the possible positions of every mass point are adequately described by the coordinates and the appropriate constraints. For this reason, the coordinates entering the Lagrangian are called generalized coordinates. Usually, one chooses generalized coordinates for convenience, to minimize the required computational work, or to decrease the number of necessary constraints.

However, you may be asking yourself: why is it that one is allowed to use arbitrary coordinates in the Lagrangian formalism? Certainly, as we know, Newton's laws are not the same in different coordinates: for instance, the mass times the acceleration is equal to the force only if the acceleration is computed as ${\displaystyle {\ddot {\vec {x}}}(t)}$, where ${\displaystyle {\vec {x}}(t)}$ is the vector of Cartesian coordinates ${\displaystyle (x,y,z)}$. This formula would be incorrect if the vector ${\displaystyle {\vec {x}}=(x_{1},x_{2},x_{3})}$ consisted of, say, the radius ${\displaystyle r={\sqrt {x^{2}+y^{2}}}}$ in the ${\displaystyle (x,y)}$ plane, the azimuthal angle ${\displaystyle \phi }$ in that plane, and the coordinate ${\displaystyle z}$. However, the Lagrangian formalism will work just fine if we express the kinetic and the potential energy through the variables ${\displaystyle (x_{1},x_{2},x_{3})=(r,\phi ,z)}$. The equations of motion will be given by the Euler-Lagrange equation,

${\displaystyle {\frac {d}{dt}}{\frac {\partial L}{\partial {\dot {\vec {x}}}}}={\frac {\partial L}{\partial {\vec {x}}}},}$

as before. One says that the Lagrangian formalism is covariant with respect to coordinate transformations.
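A quick numerical illustration of this statement, under assumptions that are not in the text above: we take a free particle (zero force) moving along a straight line, describe it in the cylindrical coordinates ${\displaystyle (r,\phi ,z)}$, and use the Lagrangian ${\displaystyle L={\frac {m}{2}}({\dot {r}}^{2}+r^{2}{\dot {\phi }}^{2}+{\dot {z}}^{2})}$. Its Euler-Lagrange equations, written per unit mass, are ${\displaystyle {\ddot {r}}-r{\dot {\phi }}^{2}=0}$, ${\displaystyle {\frac {d}{dt}}(r^{2}{\dot {\phi }})=0}$ and ${\displaystyle {\ddot {z}}=0}$. The sketch evaluates these expressions with finite differences; the particular straight line is an arbitrary choice.

```python
import math

def cart(t):
    """Assumed free-particle motion in Cartesian coordinates: a straight line."""
    return 1.0 + 0.5 * t, t, 2.0 * t

def r(t):
    x, y, _ = cart(t)
    return math.hypot(x, y)

def phi(t):
    x, y, _ = cart(t)
    return math.atan2(y, x)

def z(t):
    return cart(t)[2]

def d1(f, t, h=1e-3):
    """Central finite-difference first derivative."""
    return (f(t + h) - f(t - h)) / (2 * h)

def d2(f, t, h=1e-3):
    """Central finite-difference second derivative."""
    return (f(t + h) - 2 * f(t) + f(t - h)) / h ** 2

def naive_residual(t):
    """'Acceleration = force / mass = 0' applied blindly to r(t)."""
    return d2(r, t)

def el_residuals(t):
    """Euler-Lagrange residuals (per unit mass) in the coordinates (r, phi, z)."""
    radial = d2(r, t) - r(t) * d1(phi, t) ** 2
    angular = d1(lambda s: r(s) ** 2 * d1(phi, s), t)
    vertical = d2(z, t)
    return radial, angular, vertical

for t in (0.5, 1.0, 2.0):
    rad, ang, ver = el_residuals(t)
    print(f"t = {t}: naive r_ddot = {naive_residual(t):.4f}, "
          f"EL residuals = {rad:.1e}, {ang:.1e}, {ver:.1e}")
```

The quantity ${\displaystyle {\ddot {r}}}$ is clearly nonzero even though no force acts, while all three Euler-Lagrange residuals vanish up to discretization error.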

The reason for this can be explained in two ways: either more formally, by showing that the Euler-Lagrange equations remain the same under an arbitrary change of coordinates; or more visually, by approaching the situation from the geometric point of view.

### Formal derivation

For simplicity, we shall only consider a one-dimensional problem with a Lagrangian ${\displaystyle L(q,{\dot {q}},t)}$, where ${\displaystyle q(t)}$ is a generalized coordinate. The same consideration is very easily generalized to the case of multiple coordinates.

Suppose that a new coordinate ${\displaystyle x(t)}$ is chosen instead of ${\displaystyle q(t)}$. The new coordinate can be a function of the old coordinate. Let us consider an even more general case where the change of coordinates depends on time (i.e. we may choose slightly different coordinates at different times). Then the new coordinate is related to the old one by a formula such as

${\displaystyle q(t)=F(x(t),t),}$

where ${\displaystyle F(x,t)}$ is a known function.

Now we need to express the old Lagrangian ${\displaystyle L(q,{\dot {q}},t)}$ through the new variable ${\displaystyle x}$ and its derivative ${\displaystyle {\dot {x}}}$. We have

${\displaystyle {\dot {q}}=F_{,t}+F_{,x}{\dot {x}},}$

where we denote partial derivatives by subscripts with commas, e.g. ${\displaystyle \partial f(a,b,c)/\partial a\equiv f_{,a}}$. This is a condensed notation frequently used in physics.

The Lagrangian expressed through the new variable ${\displaystyle x}$ is therefore

${\displaystyle L(q,{\dot {q}},t)={\tilde {L}}(x,{\dot {x}},t)=L(F(x,t),F_{,t}+F_{,x}{\dot {x}},t).}$

The new variable ${\displaystyle x}$ is a good variable if it is a nontrivial function of the old one, i.e. if ${\displaystyle F_{,x}\neq 0}$. Then the new Lagrangian will be a nontrivial function that depends on ${\displaystyle {\dot {x}}}$ as well as on ${\displaystyle x}$. So we shall assume that ${\displaystyle F_{,x}\neq 0}$ at least within some interval of ${\displaystyle x}$.

Now let us compare the equations of motion (EOM) that we would derive in the old coordinates and in the new coordinates.

The old EOM can be written as

${\displaystyle {\frac {d}{dt}}L_{,{\dot {q}}}=L_{,q}.}$

The new EOM is

${\displaystyle {\frac {d}{dt}}{\tilde {L}}_{,{\dot {x}}}={\tilde {L}}_{,x}.}$

Let us express this equation through ${\displaystyle L}$ instead of ${\displaystyle {\tilde {L}}}$:

${\displaystyle {\tilde {L}}_{,x}=L_{,q}F_{,x}+L_{,{\dot {q}}}(F_{,tx}+F_{,xx}{\dot {x}}),}$

${\displaystyle {\tilde {L}}_{,{\dot {x}}}=L_{,{\dot {q}}}F_{,x},}$

${\displaystyle {\frac {d}{dt}}{\tilde {L}}_{,{\dot {x}}}=F_{,x}{\frac {d}{dt}}L_{,{\dot {q}}}+L_{,{\dot {q}}}{\frac {d}{dt}}F_{,x}.}$

Therefore, the new EOM is

${\displaystyle F_{,x}{\frac {d}{dt}}L_{,{\dot {q}}}+L_{,{\dot {q}}}{\frac {d}{dt}}F_{,x}=L_{,q}F_{,x}+L_{,{\dot {q}}}(F_{,tx}+F_{,xx}{\dot {x}}).}$

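Since ${\displaystyle {\frac {d}{dt}}F_{,x}=F_{,tx}+F_{,xx}{\dot {x}}}$, the terms proportional to ${\displaystyle L_{,{\dot {q}}}}$ cancel on both sides, and we find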

${\displaystyle F_{,x}{\frac {d}{dt}}L_{,{\dot {q}}}=F_{,x}L_{,q}.}$

We find that the new EOM is indeed equivalent to the old one, under the assumption that ${\displaystyle F_{,x}\neq 0}$.
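The result can also be checked numerically for a concrete case. The following sketch rests on assumptions chosen only for illustration: a harmonic oscillator Lagrangian ${\displaystyle L={\dot {q}}^{2}/2-q^{2}/2}$ with the known solution ${\displaystyle q(t)=\cos t}$, and the time-dependent change of coordinates ${\displaystyle q=F(x,t)=\sinh x+t}$, for which ${\displaystyle F_{,x}=\cosh x\neq 0}$ everywhere. The helper `el_residual` is a hypothetical name; it evaluates ${\displaystyle {\frac {d}{dt}}L_{,{\dot {q}}}-L_{,q}}$ along a given path using finite differences.

```python
import math

def L(q, qdot, t):
    """Old Lagrangian (assumed): harmonic oscillator with unit mass and frequency."""
    return 0.5 * qdot ** 2 - 0.5 * q ** 2

# Assumed change of coordinates q = F(x, t) and its partial derivatives.
def F(x, t):
    return math.sinh(x) + t

def F_x(x, t):
    return math.cosh(x)

def F_t(x, t):
    return 1.0

def L_new(x, xdot, t):
    """New Lagrangian: L(F, F_t + F_x * xdot, t)."""
    return L(F(x, t), F_t(x, t) + F_x(x, t) * xdot, t)

def q_old(t):
    """Known solution of the old equation of motion."""
    return math.cos(t)

def x_new(t):
    """The same trajectory expressed in the new coordinate: x = asinh(q - t)."""
    return math.asinh(q_old(t) - t)

def ddt(f, t, h=1e-4):
    """Central finite-difference time derivative."""
    return (f(t + h) - f(t - h)) / (2 * h)

def el_residual(lagr, path, t, h=1e-4):
    """d/dt (dL/d(velocity)) - dL/d(coordinate), evaluated along path at time t."""
    def momentum(s):
        x, v = path(s), ddt(path, s)
        return (lagr(x, v + h, s) - lagr(x, v - h, s)) / (2 * h)
    x, v = path(t), ddt(path, t)
    force = (lagr(x + h, v, t) - lagr(x - h, v, t)) / (2 * h)
    return ddt(momentum, t) - force

for t in (0.3, 1.0, 1.7):
    print(f"t = {t}: old residual = {el_residual(L, q_old, t):.1e}, "
          f"new residual = {el_residual(L_new, x_new, t):.1e}")
```

Both residuals vanish up to discretization error: the trajectory that solves the old equation of motion, rewritten in the new coordinate, solves the new one.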

### Geometric picture

The computation presented above is straightforward and explicit, but may leave you wondering why it works. Here is a more visual explanation.

The Euler-Lagrange equations express the condition that the functional ${\displaystyle S[q]}$ has an extremum at the trajectory ${\displaystyle q(t)}$. Let us imagine a space of all trajectories, i.e. some huge space where each "point" represents one entire trajectory ${\displaystyle q(t)}$. The functional ${\displaystyle S[q]}$ has an extremum at some "point" ${\displaystyle q_{0}}$ which is the actual trajectory of the mechanical system. When we change coordinates, ${\displaystyle q\to x}$, we merely change our description of this space of trajectories. We cannot change the fact that the functional ${\displaystyle S}$ has an extremum somewhere, at some "point" ${\displaystyle q_{0}}$. We may only change our description of this "point". Therefore, after a change of variables the new functional ${\displaystyle {\tilde {S}}[x]=S[q]}$ will again have an extremum at some "point" ${\displaystyle x_{0}}$, and this "point" ${\displaystyle x_{0}}$ will have to correspond to the "point" ${\displaystyle q_{0}}$ after the change of variables. The existence of the extremum is a geometric characteristic of the shape of the functional ${\displaystyle S}$; that's why it is independent of the way we choose to describe it with coordinates.

Let us consider a simple example where we use functions instead of functionals. The function ${\displaystyle f(q)=(q-1)^{2}}$ has a minimum at ${\displaystyle q=1}$. We may change coordinates and use ${\displaystyle x}$ instead of ${\displaystyle q}$, where e.g. ${\displaystyle q=F(x)\equiv 2\sin x}$. This is a well-defined change of variables on the interval ${\displaystyle x\in (-\pi /2,\pi /2)}$, where ${\displaystyle F_{,x}\neq 0}$. In the new coordinates, the function ${\displaystyle f(q)}$ looks like ${\displaystyle {\tilde {f}}(x)=(2\sin x-1)^{2}}$. This function has a minimum at ${\displaystyle x=\pi /6}$ where ${\displaystyle 2\sin x=1}$. But geometrically speaking, this is exactly the same function as before, except viewed in different coordinates. Therefore, it is no surprise that the minimum ${\displaystyle x=\pi /6}$ is the old minimum ${\displaystyle q=1}$ after the change of coordinates.

This equivalence can be seen more formally. The condition for the minimum of the function ${\displaystyle {\tilde {f}}(x)}$ is

${\displaystyle {\frac {d}{dx}}{\tilde {f}}(x)=0={\frac {df(q)}{dq}}{\frac {dF}{dx}}.}$

This condition is equivalent to the condition for the minimum of the function ${\displaystyle f(q)}$, namely ${\displaystyle f_{,q}=0}$, as long as ${\displaystyle F_{,x}\neq 0}$. This is why the position of the minimum in the old coordinates, ${\displaystyle q=1}$, exactly corresponds to the position of the minimum in the new coordinates, ${\displaystyle x=\pi /6}$.
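This example is simple enough to verify by brute force. The sketch below searches a fine grid on the open interval ${\displaystyle (-\pi /2,\pi /2)}$ for the minimum of ${\displaystyle {\tilde {f}}(x)}$ and maps the result back to the old coordinate; the grid size is an arbitrary choice.

```python
import math

def f(q):
    """The function in the old coordinate."""
    return (q - 1.0) ** 2

def F(x):
    """Change of coordinates q = F(x) = 2 sin x."""
    return 2.0 * math.sin(x)

def f_new(x):
    """The same function viewed in the new coordinate."""
    return f(F(x))

# Midpoints of n equal cells covering the open interval (-pi/2, pi/2).
n = 100000
lo, hi = -math.pi / 2, math.pi / 2
grid = [lo + (hi - lo) * (i + 0.5) / n for i in range(n)]

x_min = min(grid, key=f_new)  # grid point where f_new is smallest
q_min = F(x_min)              # the same point in the old coordinate

print(f"x_min = {x_min:.5f}, pi/6 = {math.pi / 6:.5f}, F(x_min) = {q_min:.5f}")
```

Up to the grid resolution, the minimum is found at ${\displaystyle x=\pi /6}$, and it maps back to ${\displaystyle q=1}$.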

Similarly, when we consider functionals, we may write the condition for the minimum of ${\displaystyle {\tilde {S}}[x]=S[F(x)]}$ in new coordinates as

${\displaystyle {\frac {\delta {\tilde {S}}}{\delta x(t)}}=0={\frac {\delta S}{\delta q(t)}}{\frac {dF}{dx}}.}$

It is clear that the condition for the minimum remains the same under the change of variables, as long as the new variables are well-defined, i.e. ${\displaystyle F_{,x}\neq 0}$.

## Is the Lagrangian unique?

Another important question is whether there is only one Lagrangian that yields the correct equations of motion for a given system. The answer is that there are infinitely many different Lagrangians that can be used for any given system.

First of all, one may always multiply the Lagrangian by a constant ${\displaystyle \alpha }$ and also add an arbitrary fixed function of time, ${\displaystyle F(t)}$, to the Lagrangian. The modified Lagrangian is then ${\displaystyle {\tilde {L}}(q,{\dot {q}},t)=\alpha L(q,{\dot {q}},t)+F(t)}$. The term ${\displaystyle F(t)}$ is "fixed" in the sense that it does not depend on ${\displaystyle q(t)}$. Then we can integrate this term explicitly and express the modified action as

${\displaystyle {\tilde {S}}[q]=\alpha S[q]+\int _{t_{1}}^{t_{2}}F(t)dt.}$

The last term above is simply a number. Clearly, this modification of the action is irrelevant: if ${\displaystyle q(t)}$ is an extremum of ${\displaystyle S[q]}$, then it is also an extremum of ${\displaystyle {\tilde {S}}[q]}$. Adding a constant to a function does not change the position of the extrema.

More generally, we may add an arbitrary total time derivative to the Lagrangian:

${\displaystyle {\tilde {L}}=L+{\frac {d}{dt}}F(q,t).}$

The resulting modification of the action is

${\displaystyle {\tilde {S}}[q]=S[q]+\int _{t_{1}}^{t_{2}}{\frac {d}{dt}}F(q,t)dt=S[q]+F(q_{2},t_{2})-F(q_{1},t_{1}),}$

where ${\displaystyle q_{1},q_{2}}$ are the boundary values of ${\displaystyle q(t)}$. Since these values are fixed and do not vary when we vary ${\displaystyle q(t)}$, the extra term in the action is again a constant. Therefore, this modification of the action does not change the equations of motion. One says that two Lagrangians differing by a total derivative are equivalent.
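To see concretely that the extra term depends only on the boundary values, one can integrate ${\displaystyle dF/dt}$ along two different paths with the same end points. The sketch below assumes the hypothetical choice ${\displaystyle F(q,t)=q^{2}t}$, the interval ${\displaystyle [0,1]}$ and the boundary values ${\displaystyle q_{1}=0}$, ${\displaystyle q_{2}=1}$; none of these come from the text.

```python
import math

T1, T2 = 0.0, 1.0  # assumed time interval
Q1, Q2 = 0.0, 1.0  # assumed fixed boundary values of q(t)

def simpson(f, a, b, n=1000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def F(q, t):
    """Hypothetical function whose total time derivative is added to L."""
    return q ** 2 * t

def dF_dt(q, qdot, t):
    """Total derivative F_q * qdot + F_t for the F above."""
    return 2 * q * qdot * t + q ** 2

# Two paths (q(t), q_dot(t)) with the same end points q(0) = 0, q(1) = 1.
paths = {
    "straight": (lambda t: t, lambda t: 1.0),
    "wiggly": (
        lambda t: t + 0.4 * math.sin(2 * math.pi * t),
        lambda t: 1.0 + 0.8 * math.pi * math.cos(2 * math.pi * t),
    ),
}

def extra_action(q, qdot):
    """Contribution of dF/dt to the action along the given path."""
    return simpson(lambda t: dF_dt(q(t), qdot(t), t), T1, T2)

boundary_term = F(Q2, T2) - F(Q1, T1)
extras = {name: extra_action(q, qd) for name, (q, qd) in paths.items()}
for name, value in extras.items():
    print(f"{name:9s} extra action = {value:.8f}   boundary term = {boundary_term:.8f}")
```

Both paths give the same extra contribution, equal to ${\displaystyle F(q_{2},t_{2})-F(q_{1},t_{1})}$, so the extra term cannot influence which path is the extremum.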

One may even allow functions ${\displaystyle F}$ that depend on derivatives of ${\displaystyle q(t)}$ as well as on ${\displaystyle q(t)}$. However, in this case one would need to keep fixed also the values of the corresponding derivatives of ${\displaystyle q(t)}$ at the boundary points ${\displaystyle t_{1},t_{2}}$.

So, as we see, the Lagrangian for a given physical system is not unique. The recipe "kinetic energy minus potential energy" is merely a simple rule that yields a good Lagrangian.

The variety of equivalent Lagrangians is not limited to those that differ by a total derivative or by a constant coefficient. For example, the Lagrangians

${\displaystyle L(q,{\dot {q}})=q^{2}{\dot {q}}^{4},\quad {\tilde {L}}(q,{\dot {q}})=q^{3}{\dot {q}}^{6},}$

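lead to equivalent equations of motion. The Euler-Lagrange equation is ${\displaystyle 6q{\dot {q}}^{2}[{\dot {q}}^{2}+2q{\ddot {q}}]=0}$ for ${\displaystyle L}$ and ${\displaystyle 15q^{2}{\dot {q}}^{4}[{\dot {q}}^{2}+2q{\ddot {q}}]=0}$ for ${\displaystyle {\tilde {L}}}$; both have the same solutions as

${\displaystyle [{\dot {q}}^{2}+2q{\ddot {q}}]q{\dot {q}}^{2}=0,}$

even though one obviously cannot find a function ${\displaystyle F(q,t)}$ and a constant ${\displaystyle \alpha }$ such that ${\displaystyle {\tilde {L}}=\alpha L+dF/dt}$. (Such a function would produce only the extra terms ${\displaystyle F_{,t}+F_{,q}{\dot {q}}}$ in the Lagrangian, which are at most linear in ${\displaystyle {\dot {q}}}$, and cannot change the power of ${\displaystyle {\dot {q}}}$.)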
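As a numerical cross-check, one can take a particular solution of ${\displaystyle {\dot {q}}^{2}+2q{\ddot {q}}=0}$ and verify that it satisfies the Euler-Lagrange equations of both Lagrangians. The sketch below uses ${\displaystyle q(t)=t^{2/3}}$ for ${\displaystyle t>0}$, which solves that equation as one can confirm by direct substitution. The helper names and the comparison path ${\displaystyle q(t)=t^{2}}$ are illustrative choices.

```python
def L_a(q, v):
    """First Lagrangian, q^2 * qdot^4."""
    return q ** 2 * v ** 4

def L_b(q, v):
    """Second Lagrangian, q^3 * qdot^6."""
    return q ** 3 * v ** 6

def solution(t):
    """q(t) = t^(2/3): satisfies qdot^2 + 2 q qddot = 0 for t > 0."""
    return t ** (2.0 / 3.0)

def non_solution(t):
    """q(t) = t^2: does not satisfy the equation (for comparison)."""
    return t ** 2

def ddt(f, t, h=1e-4):
    """Central finite-difference time derivative."""
    return (f(t + h) - f(t - h)) / (2 * h)

def el_residual(lagr, path, t, h=1e-4):
    """d/dt (dL/dqdot) - dL/dq, evaluated along path at time t."""
    def momentum(s):
        q, v = path(s), ddt(path, s)
        return (lagr(q, v + h) - lagr(q, v - h)) / (2 * h)
    q, v = path(t), ddt(path, t)
    force = (lagr(q + h, v) - lagr(q - h, v)) / (2 * h)
    return ddt(momentum, t) - force

for t in (1.0, 1.5, 2.0):
    print(f"t = {t}: residual for L = {el_residual(L_a, solution, t):.1e}, "
          f"residual for L-tilde = {el_residual(L_b, solution, t):.1e}")
print("non-solution, t = 1:", el_residual(L_a, non_solution, 1.0),
      el_residual(L_b, non_solution, 1.0))
```

Both residuals vanish along ${\displaystyle q(t)=t^{2/3}}$ up to discretization error, while the comparison path gives large residuals for both Lagrangians.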