Introduction to Mathematical Physics/Differentials and derivatives

Definitions

Definition:

Let $E$ and $F$ be two normed vector spaces over $R$ or $C$, and $f$ a map defined on an open subset $U$ of $E$ into $F$. $f$ is said to be differentiable at a point $x_{0}$ of $U$ if there exists a continuous linear map $g$ from $E$ into $F$ such that $\|f(x)-f(x_{0})-g(x-x_{0})\|$ is negligible with respect to $\|x-x_{0}\|$ as $x$ tends to $x_{0}$.

The notion of derivative is less general and is usually defined for functions from a subset of $R$ to a vector space as follows:

Definition:

Let $I$ be an interval of $R$ not reduced to a point and $E$ a normed vector space over $R$. A map $f$ from $I$ to $E$ admits a derivative at the point $x$ of $I$ if the ratio:

{\frac {f(x+h)-f(x)}{h}}

admits a limit as $h$ tends to zero. This limit is then called the derivative of $f$ at the point $x$ and is denoted $f^{\prime }(x)$.

We will however see in this appendix some generalizations of derivatives.

Derivatives in the distribution sense

Definition

The derivative in the usual function sense is not defined for discontinuous functions. Distribution theory makes it possible, in particular, to generalize the classical notion of derivative to discontinuous functions.

Definition:

The derivative of a distribution $T$ is the distribution $T'$ defined by:

\forall \phi \in {\mathcal {D}},\quad \langle T'|\phi \rangle =-\langle T|\phi '\rangle

Definition:

Let $f$ be a summable function. Assume that $f$ is discontinuous at $N$ points $a_{i}$, and denote by $\sigma _{a_{i}}=f(a_{i}^{+})-f(a_{i}^{-})$ the jump of $f$ at $a_{i}$. Assume that $f'$ is locally summable and defined almost everywhere. It then defines a distribution $T_{f'}$. The derivative $(T_{f})'$ of the distribution associated with $f$ is:

(T_{f})'=T_{f'}+\sum \sigma _{a_{i}}\delta _{a_{i}}

One says that the derivative in the distribution sense is equal to the derivative without precaution, augmented by the Dirac distributions multiplied by the jumps of $f$. Denoting by $\{f'\}$ the derivative without precaution, this can be written:

f'=\{f'\}+\sum \sigma _{a_{i}}\delta _{a_{i}}
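The jump formula above can be checked numerically on a simple case. The following is a minimal sketch, not taken from the source: the test function `phi`, the function `f` (a sine plus a step), the jump location `A` and the jump size `JUMP` are all illustrative assumptions. It compares $-\langle T_{f}|\phi '\rangle$ with $\langle T_{f'}|\phi \rangle +\sigma _{a}\phi (a)$.

```python
import math

def phi(x):
    """Smooth bump test function supported on (-1, 1) (illustrative choice)."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def dphi(x):
    """Classical derivative of the bump function."""
    if abs(x) >= 1.0:
        return 0.0
    return phi(x) * (-2.0 * x) / (1.0 - x * x) ** 2

A, JUMP = 0.3, 2.0  # hypothetical discontinuity location and jump size

def f(x):
    """sin(x) plus a step of height JUMP at x = A."""
    return math.sin(x) + (JUMP if x > A else 0.0)

def fprime(x):
    """Derivative 'without precaution' {f'}, valid away from x = A."""
    return math.cos(x)

def integrate(g, lo, hi, n=20000):
    """Midpoint rule on [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

def split_integral(g):
    """Integrate over the support of phi, splitting at the discontinuity."""
    return integrate(g, -1.0, A) + integrate(g, A, 1.0)

# <(T_f)' | phi> = -<T_f | phi'>
lhs = -split_integral(lambda x: f(x) * dphi(x))
# <T_{f'} | phi> + sigma_a * phi(a)
rhs = split_integral(lambda x: fprime(x) * phi(x)) + JUMP * phi(A)

print(lhs, rhs)
```

The two printed values should agree up to the quadrature error, which illustrates that the Dirac term carries exactly the contribution of the jump.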

Case of distributions of several variables

secdisplu

Using derivatives without precautions, the action of differential operators in the distribution sense can be written as follows when the functions on which they act are discontinuous across a surface $S$:

{\frac {\partial f}{\partial x_{i}}}=\{{\frac {\partial f}{\partial x_{i}}}\}+n_{i}\sigma _{f}\delta _{S}

{\mbox{ grad }}f=\{{\mbox{ grad }}f\}+n\sigma _{f}\delta _{S}

{\mbox{ div }}a=\{{\mbox{ div }}a\}+n\cdot \sigma _{a}\delta _{S}

{\mbox{ rot }}a=\{{\mbox{ rot }}a\}+n\wedge \sigma _{a}\delta _{S}

where $f$ is a scalar function, $a$ a vector function, $n$ the unit normal vector to $S$ (with components $n_{i}$), $\sigma$ the jump of $a$ or $f$ across the surface $S$, and $\delta _{S}$ the surface Dirac distribution. These formulas make it possible to obtain the Green function introduced for tensors. The geometrical implications of the differential operators are considered in the next appendix chapters.

Example: Electromagnetism. The fundamental laws of electromagnetism are the Maxwell equations:

{\mbox{ rot }}E=-{\frac {\partial B}{\partial t}}

{\mbox{ rot }}H=j+{\frac {\partial D}{\partial t}}

{\mbox{ div }}D=\rho

{\mbox{ div }}B=0

These equations are also true in the distribution sense. In books on electromagnetism, a chapter is classically devoted to the study of boundary conditions and passage conditions. The use of distributions makes it possible to treat these conditions as a particular case of the general equations. Consider for instance a charge distribution defined by:

\rho =\rho _{v}+\rho _{s}

where $\rho _{v}$ is a volume charge and $\rho _{s}$ a surface charge, and a current distribution defined by:

j=j_{v}+j_{s}

where $j_{v}$ is a volume current and $j_{s}$ a surface current. Using the formulas of section secdisplu, one obtains the following passage relations:

{\begin{matrix}n_{12}\wedge (E_{2}-E_{1})&=&0\\n_{12}\wedge (H_{2}-H_{1})&=&j_{s}\\n_{12}\cdot (D_{2}-D_{1})&=&\rho _{s}\\n_{12}\cdot (B_{2}-B_{1})&=&0\end{matrix}}

where the coefficients of the surface Dirac distribution $\delta _{S}$ have been identified (see [#References|references]).
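As a worked step, the identification can be made explicit for the equation on $D$. This is a sketch under the assumption that the surface part of the charge is written $\rho _{s}\delta _{S}$, with $\rho _{s}$ the surface density, and that $D_{1}$, $D_{2}$ are the values of $D$ on either side of $S$, so that the jump is $\sigma _{D}=D_{2}-D_{1}$:

```latex
\begin{aligned}
\mathrm{div}\,D &= \{\mathrm{div}\,D\} + n_{12}\cdot (D_{2}-D_{1})\,\delta _{S}
                 = \rho _{v} + \rho _{s}\,\delta _{S} \\
\Longrightarrow\quad \{\mathrm{div}\,D\} &= \rho _{v},
\qquad n_{12}\cdot (D_{2}-D_{1}) = \rho _{s}
\end{aligned}
```

The regular parts give back the usual volume equation, and the coefficients of $\delta _{S}$ give the passage relation. The three other relations follow in the same way from the corresponding Maxwell equations.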

Example: Electrical circuits

As the Maxwell equations are true in the distribution sense (see previous example), the equations of electricity are also true in the distribution sense. Distribution theory makes it possible to justify some statements that are often left unjustified in electricity courses. Consider the equation:

U(t)=L{\frac {di}{dt}}+Ri

This equation implies that even if $U$ is not continuous, $i$ is. Indeed, if $i$ were not continuous, the derivative ${\frac {di}{dt}}$ would create a Dirac distribution in the right-hand side, which has no counterpart in $U$. Consider the equation:

i={\frac {dq}{dt}}

This equation implies that $q(t)$ is continuous even if $i$ is discontinuous.
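A simple illustration, assuming a voltage step $U(t)=U_{0}H(t)$ (with $H$ the Heaviside function) applied to a circuit at rest for $t<0$; these choices are assumptions made for the example and are not in the original text:

```latex
U(t)=U_{0}H(t)
\quad\Longrightarrow\quad
i(t)=\frac{U_{0}}{R}\left(1-e^{-Rt/L}\right)H(t),
\qquad
\frac{di}{dt}=\frac{U_{0}}{L}\,e^{-Rt/L}\,H(t)
```

The current $i$ is continuous at $t=0$ although $U$ jumps by $U_{0}$: the discontinuity of $U$ is carried by ${\frac {di}{dt}}$, which jumps by $U_{0}/L$, and no Dirac distribution appears.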

Example: Fluid mechanics. Conservation laws are true in the distribution sense. Using distribution derivatives, the so-called "discontinuity" relations can be obtained immediately ([#References|references]).

Differentiation of Stochastic processes

secstoch

When one speaks of stochastic processes ([#References|references]), one adds the notion of time. Taking again the example of the dice, if we repeat the experiment $N$ times, then the number of possible results is $|\Omega '|=6^{N}$ (the size of the set of outcomes grows exponentially with $N$). Using this $\Omega '$, we can define a probability $P'$. So, from the first random variable $X$, we can define another random variable $X_{t}$:

Definition:

Let $X$ be a random variable. A stochastic process (associated with $X$) is a function of $X$ and $t$.

$X_{t}$ is called a stochastic function of $X$ or a stochastic process. Generally, the probability $P(X_{t}\in {\mathrel {[}}x,x+dx{\mathrel {[}}{\mbox{ at }}t_{i})$ depends on the history of the values of $X_{t}$ before $t_{i}$. One defines the conditional probability $P(X_{t=t_{i}}\in {\mathrel {[}}x,x+dx{\mathrel {[}}|X_{t<t_{i}})$ as the probability for $X_{t}$ to take a value between $x$ and $x+dx$ at time $t_{i}$, knowing the values of $X_{t}$ for times anterior to $t_{i}$ (the "history" of $X_{t}$). A Markov process is a stochastic process with the property that for any set of successive times $t_{1},\dots ,t_{n}$ one has:

P_{1|n-1}(X_{t=t_{n}}\in {\mathrel {[}}x,x+dx{\mathrel {[}}|X_{t_{1}}\dots X_{t_{n-1}})=P_{1|1}(X_{t=t_{n}}\in {\mathrel {[}}x,x+dx{\mathrel {[}}|X_{t_{n-1}})

$P_{i|j}$ denotes the probability for $i$ conditions to be satisfied, knowing $j$ anterior events. In other words, the probability distribution of $X_{t}$ at time $t_{n}$ depends only on the value of $X_{t}$ at the previous time $t_{n-1}$. A Markov process is thus defined by $P_{1}$ and the transition probability $P_{1|1}$ (or equivalently by the density function $f_{1}(x,t)$ and the transition density function $f_{1|1}(x_{2},t_{2}|x_{1},t_{1})$). It can be shown ([#References|references]) that two functions $f_{1}$ and $f_{1|1}$ define a Markov process if and only if they verify:

the Chapman-Kolmogorov equation:

f_{1|1}(x_{3},t_{3}|x_{1},t_{1})=\int f_{1|1}(x_{3},t_{3}|x_{2},t_{2})f_{1|1}(x_{2},t_{2}|x_{1},t_{1})dx_{2}

and the compatibility relation:

eqnecmar

f_{1}(x_{2},t_{2})=\int f_{1|1}(x_{2},t_{2}|x_{1},t_{1})f_{1}(x_{1},t_{1})dx_{1}

A Wiener process (or Brownian motion) is a Markov process for which:

f_{1|1}(x_{2},t_{2}|x_{1},t_{1})={\frac {1}{\sqrt {2\pi (t_{2}-t_{1})}}}e^{-{\frac {(x_{2}-x_{1})^{2}}{2(t_{2}-t_{1})}}}

Using equation eqnecmar, for a process started at $x=0$ at time $t=0$, one gets:

f_{1}(x,t)={\frac {1}{\sqrt {2\pi t}}}e^{-{\frac {x^{2}}{2t}}}
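The Chapman-Kolmogorov equation can be checked numerically for the Wiener transition density. The sketch below is only illustrative: the function names, the sample points and times, and the quadrature parameters are assumptions, and the integral over $x_{2}$ is truncated to a finite window.

```python
import math

def wiener_kernel(x2, t2, x1, t1):
    """Transition density f_{1|1}(x2, t2 | x1, t1) of the Wiener process."""
    dt = t2 - t1
    return math.exp(-(x2 - x1) ** 2 / (2.0 * dt)) / math.sqrt(2.0 * math.pi * dt)

def chapman_kolmogorov_rhs(x3, t3, x1, t1, t2, half_width=12.0, n=4000):
    """Midpoint-rule integral over the intermediate position x2.

    The real line is truncated to [-half_width, half_width]; the Gaussian
    tails outside this window are negligible for the values used here.
    """
    h = 2.0 * half_width / n
    total = 0.0
    for k in range(n):
        x2 = -half_width + (k + 0.5) * h
        total += wiener_kernel(x3, t3, x2, t2) * wiener_kernel(x2, t2, x1, t1)
    return h * total

# Hypothetical sample values: start at x1 = -0.2 at t1 = 0.5, end at x3 = 0.7 at t3 = 2.0
lhs = wiener_kernel(0.7, 2.0, -0.2, 0.5)
rhs = chapman_kolmogorov_rhs(0.7, 2.0, -0.2, 0.5, t2=1.2)
print(lhs, rhs)
```

The result does not depend on the intermediate time `t2`, as long as it lies strictly between `t1` and `t3`.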

As stochastic processes were defined as functions of a random variable and time, a large class of stochastic processes can be defined as functions of Brownian motion (or Wiener process) $W_{t}$ (this definition however excludes discontinuous cases such as Poisson processes). This is our second definition of a stochastic process:

Definition:

Let $W_{t}$ be a Brownian motion. A stochastic process is a function of $W_{t}$ and $t$ .

For instance a model of the temporal evolution of stocks ([#References|references]) is

X_{t}=e^{(\sigma W_{t}+(\mu -{\frac {1}{2}}\sigma ^{2})t)}

A stochastic differential equation

dX_{t}=a(t,X_{t})dt+b(t,X_{t})dW_{t}

gives an implicit definition of the stochastic process. The rules of differentiation with respect to the Brownian motion variable $W_{t}$ differ from the rules of differentiation with respect to the ordinary time variable. They are given by the Itô formula ([#References|references]). To understand the difference between the differentiation of a Newtonian function and that of a stochastic function, consider the Taylor expansion, up to second order, of a function $f(W_{t})$:

f(W_{t}+dW_{t})-f(W_{t})=f^{'}(W_{t})dW_{t}+{\frac {1}{2}}f^{''}(W_{t})(dW_{t})^{2}+\dots

Usually (for Newtonian functions), the differential $df(W_{t})$ is just $f^{'}(W_{t})dW_{t}$. But for a stochastic process $f(W_{t})$, the second order term ${\frac {1}{2}}f^{''}(W_{t})(dW_{t})^{2}$ is no longer negligible. Indeed, as can be seen using the properties of the Brownian motion, we have:

\int _{0}^{t}(dW_{s})^{2}=t

or

(dW_{t})^{2}=dt.
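The relation $(dW_{t})^{2}=dt$ can be illustrated by simulation. The following minimal sketch (function names, seed and step count are arbitrary assumptions) sums the squared increments of a sampled Brownian path and, for comparison, those of a smooth function.

```python
import math
import random

def brownian_quadratic_variation(t, n_steps, rng):
    """Sum of squared Brownian increments over [0, t] on a uniform grid."""
    dt = t / n_steps
    return sum(rng.gauss(0.0, math.sqrt(dt)) ** 2 for _ in range(n_steps))

def smooth_quadratic_variation(g, t, n_steps):
    """Same sum for a differentiable function g: it vanishes as the grid is refined."""
    dt = t / n_steps
    return sum((g((k + 1) * dt) - g(k * dt)) ** 2 for k in range(n_steps))

rng = random.Random(0)  # fixed seed so that the run is reproducible
qv_brownian = brownian_quadratic_variation(t=2.0, n_steps=200_000, rng=rng)
qv_smooth = smooth_quadratic_variation(math.sin, t=2.0, n_steps=200_000)

print(qv_brownian)  # close to t = 2.0
print(qv_smooth)    # close to 0
```

For the Brownian path the sum stays close to $t$ however fine the grid is, whereas for the differentiable function it is of order $dt$. This is why the second order term must be kept in the stochastic case.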

Figure figbrown illustrates the difference between a stochastic process (a simple Brownian motion in the picture) and a differentiable function. The Brownian motion has a self-similar structure under progressive zooms.

Figure figbrown: Comparison of a progressive zooming on a Brownian motion and on a differentiable function.

Let us just mention here the most basic scheme to integrate stochastic differential equations using computers. Consider the time integration problem:

dX_{t}=a(t,X_{t})dt+b(t,X_{t})dW_{t}

with initial value:

X_{t_{0}}=X_{0}

The most basic way to approximate the solution of the previous problem is to use the Euler (or Euler-Maruyama) scheme. On a time grid $\tau _{0}=t_{0}<\tau _{1}<\dots$, the approximation $X_{n}$ of $X_{\tau _{n}}$ satisfies the following iterative scheme:

X_{n+1}=X_{n}+a(\tau _{n},X_{n})(\tau _{n+1}-\tau _{n})+b(\tau _{n},X_{n})(W_{\tau _{n+1}}-W_{\tau _{n}})

More sophisticated methods can be found in ([#References|references]).
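The iterative scheme above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a reference implementation: the grid is uniform, the function name `euler_maruyama` and the parameter values `MU` and `SIGMA` are hypothetical, and the test equation $dX_{t}=\mu X_{t}dt+\sigma X_{t}dW_{t}$ is chosen because its solution with $X_{0}=1$ is the stock model $X_{t}=e^{\sigma W_{t}+(\mu -{\frac {1}{2}}\sigma ^{2})t}$ given earlier.

```python
import math
import random

def euler_maruyama(a, b, x0, t0, t1, n_steps, rng):
    """Integrate dX = a(t, X) dt + b(t, X) dW from t0 to t1 on a uniform grid.

    Returns the final value X_N and the accumulated Brownian increment
    W_{t1} - W_{t0} of the sampled path.
    """
    dt = (t1 - t0) / n_steps
    t, x, w = t0, x0, 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))   # W_{tau_{n+1}} - W_{tau_n}
        x = x + a(t, x) * dt + b(t, x) * dw  # one Euler-Maruyama step
        t += dt
        w += dw
    return x, w

MU, SIGMA = 0.1, 0.2     # hypothetical drift and volatility
rng = random.Random(42)  # fixed seed for reproducibility

x_num, w_end = euler_maruyama(
    a=lambda t, x: MU * x,
    b=lambda t, x: SIGMA * x,
    x0=1.0, t0=0.0, t1=1.0, n_steps=10_000, rng=rng,
)
# Closed-form solution evaluated on the same Brownian path
x_exact = math.exp(SIGMA * w_end + (MU - 0.5 * SIGMA ** 2) * 1.0)
print(x_num, x_exact)
```

Comparing with the closed form on the same sampled path gives a direct view of the pathwise error, which decreases as the number of steps grows.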

Functional derivative

Let $I(\phi )$ be a functional. To calculate the differential $dI(\phi )$ of the functional $I(\phi )$, one expresses the difference $I(\phi +d\phi )-I(\phi )$ as a functional of $d\phi$.

The functional derivative of $I$, denoted ${\frac {\delta I}{\delta \phi }}$, is given by the limit:

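{\frac {\delta I}{\delta \phi (x)}}=\lim _{a\rightarrow 0}{\frac {1}{a}}{\frac {\partial I}{\partial \phi _{i}}}

where $a$ is a real number (the spacing of a discretization of the variable), $\phi _{i}=\phi (ia)$ and $x=ia$. The factor ${\frac {1}{a}}$ compensates for the weight $a$ carried by each term when the integral defining $I$ is discretized.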

Here are some examples:

Example:

If $I(\phi )=\int f(y)\phi ^{p}(y)dy$ then ${\frac {\delta I}{\delta \phi }}=pf(x)\phi ^{p-1}(x)$

Example:

If $I(\phi )=\int V(\phi (y))dy$ then ${\frac {\delta I}{\delta \phi }}=V^{\prime }(\phi (x))$ .
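The first example can be checked on a grid. The sketch below assumes the discretization $I\simeq a\sum _{i}f(ia)\phi _{i}^{p}$ and divides the partial derivative with respect to $\phi _{i}$ by the spacing $a$ so that the result stays finite as $a$ tends to zero. The functions `f` and `phi`, the exponent `P` and the grid are illustrative assumptions.

```python
import math

def functional_I(phi_values, a, f, p):
    """Riemann-sum discretization of I(phi) = integral of f(y) phi(y)^p dy."""
    return a * sum(f(i * a) * v ** p for i, v in enumerate(phi_values))

def functional_derivative(phi_values, i, a, f, p, eps=1e-4):
    """(1/a) dI/dphi_i, with the partial derivative taken by central difference."""
    up = list(phi_values)
    down = list(phi_values)
    up[i] += eps
    down[i] -= eps
    d_i = (functional_I(up, a, f, p) - functional_I(down, a, f, p)) / (2.0 * eps)
    return d_i / a

# Hypothetical data: f = cos, phi(y) = 1 + 0.5 sin(y), p = 3, spacing a = 0.01
A_STEP, P, N = 0.01, 3, 100
phi = lambda y: 1.0 + 0.5 * math.sin(y)
phi_values = [phi(i * A_STEP) for i in range(N)]

i = 40
x = i * A_STEP
numeric = functional_derivative(phi_values, i, A_STEP, math.cos, P)
analytic = P * math.cos(x) * phi(x) ** (P - 1)  # p f(x) phi^(p-1)(x)
print(numeric, analytic)
```

The numerical value should match $pf(x)\phi ^{p-1}(x)$ at $x=ia$ up to the finite-difference error.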

chapretour

Comparison of tensor values at different points

Expansion of a function in series about x=a

Definition:

A function $f$ admits an expansion in series at order $n$ around $x=a$ if there exist numbers $(\lambda _{0},\dots ,\lambda _{n})$ such that:

f(a+h)=\sum _{k=0}^{n}\lambda _{k}h^{k}+h^{n}\epsilon (h)

where $\epsilon (h)$ tends to zero when $h$ tends to zero.

Theorem:

If a function is $n$ times differentiable at $a$, then it admits an expansion in series at order $n$ around $x=a$, and this expansion is given by the Taylor-Young formula:

f(a+h)=\sum _{k=0}^{n}{\frac {1}{k!}}f^{(k)}(a)h^{k}+h^{n}\epsilon (h)

where $\epsilon (h)$ tends to zero when $h$ tends to zero and where $f^{(k)}(a)$ is the $k$-th derivative of $f$ at $x=a$.

Note that the converse of the theorem is false: $f(x)=x^{3}\sin({\frac {1}{x}})$, extended by $f(0)=0$, is a function that admits an expansion around zero at order 2 (all $\lambda _{k}$ vanish and $\epsilon (h)=h\sin({\frac {1}{h}})$) but is not twice differentiable at zero.
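The Taylor-Young formula of the theorem can be illustrated numerically by computing the function $\epsilon (h)$ for decreasing $h$. The sketch below uses the exponential at $a=0$ and order $n=3$, an arbitrary choice made for the illustration; the helper name `taylor_young_epsilon` is hypothetical.

```python
import math

def taylor_young_epsilon(f, derivs_at_a, a, h):
    """Return epsilon(h) = (f(a+h) - sum_k f^(k)(a) h^k / k!) / h^n.

    derivs_at_a lists f(a), f'(a), ..., f^(n)(a); n is deduced from its length.
    """
    n = len(derivs_at_a) - 1
    poly = sum(d / math.factorial(k) * h ** k for k, d in enumerate(derivs_at_a))
    return (f(a + h) - poly) / h ** n

# exp has all its derivatives equal to 1 at a = 0
derivs = [1.0, 1.0, 1.0, 1.0]
eps_values = [taylor_young_epsilon(math.exp, derivs, 0.0, h) for h in (1e-1, 1e-2, 1e-3)]
print(eps_values)  # decreasing towards zero, roughly like h / 24
```

Here $\epsilon (h)$ behaves like $h/24$, the first neglected term of the expansion divided by $h^{3}$.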

secderico

Non objective quantities

Consider two points $M$ and $M'$ of coordinates $x^{i}$ and $x^{i}+dx^{i}$ . A first variation often considered in physics is:

eqapdai

d(a^{i}e_{i})={\frac {\partial a^{i}}{\partial x^{j}}}dx^{j}e_{i}

The non objective variation is

da^{i}={\frac {\partial a^{i}}{\partial x^{j}}}dx^{j}

Note that $da^{i}$ is not a tensor and that equation eqapdai assumes that $e_{i}$ does not change from point $M$ to point $M'$. It does not obey the tensor transformation relations, which is why it is called a non-objective variation. An objective variation, which makes it possible to define a tensor, is presented in the next section: it takes into account the variations of the basis vectors.

exmpderr

Example: Lagrangian speed. The Lagrangian description of the motion of a particle labelled $a$ is given by its position $r_{a}$ at each time $t$. If

r_{a}(t)=x^{i}(t)e_{i}

the Lagrangian speed is:

{\frac {dr_{a}}{dt}}={\frac {dx^{i}}{dt}}e_{i}

The derivative introduced in example exmpderr is not objective, which means that it is not invariant under a change of axes. In particular, one has the well-known vectorial derivation formula:

eqvectderfor

\left({\frac {dA}{dt}}\right)_{R}=\left({\frac {dA}{dt}}\right)_{R_{1}}+\omega _{R_{1}/R}\wedge A

Example:

The Eulerian description of a fluid is given by a field of "Eulerian" velocities $v(x,t)$ and initial conditions, such that:

x=r_{a}(0)

where $r_{a}$ is the Lagrangian position of the particle, and:

v(x,t)={\frac {dr_{a}}{dt}}.

Eulerian and Lagrangian descriptions are equivalent.

Example:

Let us consider the variation of the speed field $u$ between two positions, at time $t$ . If speed field $u$ is differentiable, there exists a linear mapping $K$ such that:

eqchampudif

u_{i}({\vec {r}}+\delta {\vec {r}})-u_{i}({\vec {r}})=K_{ij}\delta r_{j}+o(\|\delta {\vec {r}}\|)

$K_{ij}=u_{i,j}$ is called the speed field gradient tensor. Tensor $K$ can be split into a symmetric and an antisymmetric part:

K=\left({\begin{array}{ccc}e_{11}&e_{12}&e_{13}\\e_{21}&e_{22}&e_{23}\\e_{31}&e_{32}&e_{33}\\\end{array}}\right)+\left({\begin{array}{ccc}0&-s_{3}&s_{2}\\s_{3}&0&-s_{1}\\-s_{2}&s_{1}&0\\\end{array}}\right)

The symmetric part is called the dilatation tensor and the antisymmetric part the rotation tensor. Now, ${\vec {u}}({\vec {r}}+\delta {\vec {r}})-{\vec {u}}({\vec {r}})={\frac {d\delta {\vec {r}}}{dt}}$. Thus, using equation eqchampudif, to first order:

{\frac {d\delta {\vec {r}}}{dt}}=K\delta {\vec {r}}

This result, true for the vector $\delta {\vec {r}}$, is also true for any vector ${\vec {a}}$. This last equation makes it possible to show that the derivative with respect to time of the elementary volume $dv$ in the neighbourhood of a particle that is followed in its movement is:

eqformvol

{\frac {d(dv)}{dt}}={\mbox{ div }}u\,dv

Indeed, for an elementary volume $\delta v=\delta x\delta y\delta z$:

d(\delta v)=d(\delta x)\delta y\delta z+d(\delta y)\delta x\delta z+d(\delta z)\delta x\delta y

so that:

{\frac {d(\delta v)}{dt}}={\mbox{ div }}u\,\delta v

The speed field gradient tensor $K$ of a solid is antisymmetric^[1].
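The decomposition of $K$ used in this example can be sketched as follows. The numerical matrix `K` is a made-up example, and the helper names are hypothetical; the components $s_{1},s_{2},s_{3}$ are read off the antisymmetric part following the layout of the matrix given above.

```python
def split_gradient(K):
    """Split a square matrix into its symmetric and antisymmetric parts."""
    n = len(K)
    sym = [[0.5 * (K[i][j] + K[j][i]) for j in range(n)] for i in range(n)]
    anti = [[0.5 * (K[i][j] - K[j][i]) for j in range(n)] for i in range(n)]
    return sym, anti

def cross(s, r):
    """Vector product s ^ r in three dimensions."""
    return (s[1] * r[2] - s[2] * r[1],
            s[2] * r[0] - s[0] * r[2],
            s[0] * r[1] - s[1] * r[0])

# Hypothetical speed field gradient K_ij = du_i/dx_j
K = [[1.0, 2.0, 0.0],
     [0.0, -0.5, 3.0],
     [4.0, 1.0, 0.25]]

sym, anti = split_gradient(K)           # dilatation and rotation tensors
divergence = sum(K[i][i] for i in range(3))  # div u = trace of K
s = (anti[2][1], anti[0][2], anti[1][0])     # (s_1, s_2, s_3)

# The antisymmetric part acts on a vector as the vector product with s
r = (1.0, 2.0, 3.0)
anti_r = tuple(sum(anti[i][j] * r[j] for j in range(3)) for i in range(3))
print(divergence, s, anti_r, cross(s, r))
```

The trace of $K$, carried entirely by the symmetric part, is the divergence that appears in equation eqformvol, while the antisymmetric part only rotates $\delta {\vec {r}}$.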

exmppartder

Example: Particle derivative of a tensor. The particle derivative (also called material derivative) is the time derivative of a quantity defined on a set of particles that are followed during their movement. When using Lagrange variables, it can be identified with the partial derivative with respect to time ([#References|references]).

The following property can be shown ([#References|references]). Let us consider the integral:

I=\int _{V}\omega

where $V$ is a connected manifold of dimension $p$ (volume, surface...) that is followed during its movement and $\omega$ a differential form of degree $p$ expressed in Euler variables. The particle derivative of $I$ verifies:

{\frac {d}{dt}}\int _{V}\omega =\int _{V}{\frac {d\omega }{dt}}

A proof of this result can be found in ([#References|references]).

Example:

Consider the integral

I=\int _{D}C(x,t)dv

where $D$ is a bounded connected domain that is followed during its movement, and $C$ is a scalar-valued function continuous in the closure of $D$ and differentiable in $D$. The particle derivative of $I$ is

{\frac {dI}{dt}}=\int _{D}\{{\frac {\partial C}{\partial t}}+{\mbox{ div }}(C{\vec {u}})\}dv,

since from equation eqformvol:

{\frac {d}{dt}}(dv)={\mbox{ div }}{\vec {u}}dv.
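The intermediate steps can be written out as follows. This is a sketch assuming that the particle derivative of the scalar $C$ expressed in Euler variables is ${\frac {\partial C}{\partial t}}+{\vec {u}}\cdot {\mbox{ grad }}C$, which is not stated explicitly in this section:

```latex
\begin{aligned}
\frac{dI}{dt}
 &= \int_{D}\left(\frac{dC}{dt}\,dv + C\,\frac{d(dv)}{dt}\right)
  = \int_{D}\left(\frac{\partial C}{\partial t}
      + \vec{u}\cdot\mathrm{grad}\,C + C\,\mathrm{div}\,\vec{u}\right)dv \\
 &= \int_{D}\left(\frac{\partial C}{\partial t}
      + \mathrm{div}(C\vec{u})\right)dv
\end{aligned}
```

The last equality uses the identity ${\mbox{ div }}(C{\vec {u}})={\vec {u}}\cdot {\mbox{ grad }}C+C\,{\mbox{ div }}{\vec {u}}$.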

secandericov

Covariant derivative

In this section a derivative that is independent of the reference frame considered is introduced (an objective derivative). Consider the difference between a quantity $a$ evaluated at two points $M$ and $M'$:

da=a(M')-a(M)=da^{i}e_{i}+a^{i}de_{i}

As in section secderico:

da^{i}e_{i}={\frac {\partial a^{i}}{\partial x^{j}}}dx^{j}e_{i}

The variation $de_{i}$ is linearly related to the $e_{j}$'s via the tangent map:

de_{i}=d\omega _{i}^{j}e_{j}

The rotation $d\omega _{i}^{j}$ depends linearly on the displacement:

eqchr

de_{i}=\Gamma _{ik}^{j}dx^{k}e_{j}

The symbols $\Gamma _{ik}^{j}$, called Christoffel symbols^[2], are not^[3] tensors. They connect the properties of space at $M$ with its properties at point $M'$. By a change of index in equation eqchr:

eqcovdiff

da=d(a^{i}e_{i})={\frac {\partial a^{i}}{\partial x^{j}}}dx^{j}e_{i}+a^{k}\Gamma _{kj}^{i}dx^{j}e_{i}

As the $x^{j}$ 's are independent variables:

Definition:

The covariant derivative of a contravariant vector $a^{i}$ is

eqdefdercov

{\frac {Da^{i}}{Dx^{j}}}={\frac {\partial a^{i}}{\partial x^{j}}}+a^{k}\Gamma _{kj}^{i}

The differential can thus be written:

da={\frac {Da^{i}}{Dx^{j}}}dx^{j}e_{i},

which is the generalization of the differential:

da^{i}={\frac {\partial a^{i}}{\partial x^{j}}}dx^{j}

considered when there is no transformation of axes. This formula can be generalized to tensors.

Remark:

For the calculation of the particle derivative presented in section secderico, the $x^{j}$ are the coordinates of the point, but the quantity to differentiate also depends on time. That is the reason why a term ${\frac {\partial x^{j}}{\partial t}}$ appears in equation eqformalder but not in equation eqdefdercov.

Remark:

From equation eqdefdercov the vectorial derivation formula of equation eqvectderfor can be recovered when:

de_{i}=\omega _{i}^{j}dte_{j}

Remark:

In spaces with a metric, the $\Gamma _{kj}^{i}$ are functions of the metric tensor $g_{ij}$.
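As an illustration of equation eqchr, the Christoffel symbols can be computed numerically from the variation of the basis vectors. The sketch below assumes plane polar coordinates $(r,\theta )$ with the natural basis $e_{r}=(\cos \theta ,\sin \theta )$, $e_{\theta }=(-r\sin \theta ,r\cos \theta )$; the function names and the finite-difference step are illustrative choices.

```python
import math

def basis(r, theta):
    """Natural basis (e_r, e_theta) of polar coordinates, in Cartesian components."""
    e_r = (math.cos(theta), math.sin(theta))
    e_theta = (-r * math.sin(theta), r * math.cos(theta))
    return [e_r, e_theta]

def christoffel(r, theta, h=1e-5):
    """Return gamma[j][i][k] such that d e_i / d x^k = gamma[j][i][k] e_j.

    Indices 0 and 1 stand for r and theta. The derivatives of the basis
    vectors are taken by central differences and then decomposed on the basis.
    """
    (a, c), (b, d) = basis(r, theta)  # e_0 = (a, c), e_1 = (b, d)
    det = a * d - b * c
    gamma = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    for k, (dr, dth) in enumerate([(h, 0.0), (0.0, h)]):
        plus = basis(r + dr, theta + dth)
        minus = basis(r - dr, theta - dth)
        for i in range(2):
            vx = (plus[i][0] - minus[i][0]) / (2.0 * h)
            vy = (plus[i][1] - minus[i][1]) / (2.0 * h)
            # Solve (vx, vy) = g0 * e_0 + g1 * e_1 by Cramer's rule
            gamma[0][i][k] = (vx * d - b * vy) / det
            gamma[1][i][k] = (a * vy - c * vx) / det
    return gamma

gamma = christoffel(r=2.0, theta=0.7)
print(gamma[1][0][1])  # Gamma^theta_{r theta}, expected 1/r
print(gamma[0][1][1])  # Gamma^r_{theta theta}, expected -r
```

In this basis the only non-zero symbols are $\Gamma _{r\theta }^{\theta }=\Gamma _{\theta r}^{\theta }={\frac {1}{r}}$ and $\Gamma _{\theta \theta }^{r}=-r$, which depend on the point and vanish identically in Cartesian coordinates.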

Covariant differential operators

The following differential operators with tensorial properties can be defined:

Gradient of a scalar:
$a={\mbox{ grad }}V$
with $a_{i}={\frac {\partial V}{\partial x^{i}}}$ .
Rotational of a vector:
$b={\mbox{ rot }}a$
with $b_{ik}={\frac {\partial a_{k}}{\partial x^{i}}}-{\frac {\partial a_{i}}{\partial x^{k}}}$ . The tensoriality of the rotational can be shown using the tensoriality of the covariant derivative (the terms containing the Christoffel symbols cancel when these are symmetric in their lower indices):
${\frac {\partial a_{k}}{\partial x^{i}}}-{\frac {\partial a_{i}}{\partial x^{k}}}={\frac {Da_{k}}{Dx^{i}}}-{\frac {Da_{i}}{Dx^{k}}}$
Divergence of a contravariant density:
$d={\mbox{ div }}a$
where $d={\frac {\partial a^{i}}{\partial x^{i}}}$ .

For more details on operators that can be defined on tensors, see ([#References|references]).

In an orthonormal Euclidean space one has the following relations:

{\mbox{ rot }}({\mbox{ grad }}\phi )=0

and

{\mbox{ div }}({\mbox{ rot }}(a))=0

\nabla \wedge (\nabla \wedge c)=\nabla (\nabla .c)-\nabla ^{2}c


[1] Indeed, let $u$ and $v$ be two position vectors attached to the solid. By definition of a solid, the scalar product $u\cdot v$ remains constant as time evolves. So:
${\frac {d(u\cdot v)}{dt}}=0$

${\frac {du}{dt}}\cdot v+u\cdot {\frac {dv}{dt}}=0$
So:
$K_{ij}u_{j}v_{i}+u_{i}K_{ij}v_{j}=0$
As this equality is true for any $u,v$ , one has:
$K_{ij}=-K_{ji}$
In other words, $K$ is antisymmetric. So, from the preceding result, for two points $P$ and $Q$ of the solid:
${\frac {d(PQ)_{i}}{dt}}=\Omega _{ij}(PQ)_{j}$
This can be rewritten by saying that the speed field is antisymmetric, i.e., one has:
$V_{P}=V_{O}+\Omega \wedge (OP)$

[2] In a space with metric $g_{ij}$, the coefficients $\Gamma _{hk}^{i}$ can be expressed as functions of the coefficients $g_{ij}$.

[3] Just as ${\frac {\partial a^{i}}{\partial x^{j}}}$ is not a tensor. However, $d(a^{i}e_{i})$ given by equation eqcovdiff does have the properties of a tensor.
