Introduction to Mathematical Physics/Differentials and derivatives

From Wikibooks, open books for an open world



Let E and F be two normed vector spaces over R or C, and f a map defined on an open subset U of E into F. f is said to be differentiable at a point x_0 of U if there exists a continuous linear map g from E into F such that \|f(x)-f(x_0)-g(x-x_0)\| is negligible with respect to \|x-x_0\| as x tends to x_0.

The notion of derivative is less general and is usually defined for functions from a subset of R to a vector space as follows:


Let I be an interval of R not reduced to a point and E a normed vector space over R. A map f from I to E admits a derivative at the point x of I if the ratio:

\frac{f(x+h)-f(x)}{h}

admits a limit as h tends to zero. This limit is then called the derivative of f at the point x and is denoted f^\prime(x).
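
The definition can be tried numerically. The following is a minimal sketch, assuming an illustrative map from R to R^2 (the function `circle` and the point x = 0.3 are choices made here, not taken from the text): the difference quotient approaches the known derivative as h decreases.

```python
import math

def difference_quotient(f, x, h):
    """Return (f(x+h) - f(x)) / h componentwise for a vector-valued f."""
    fx, fxh = f(x), f(x + h)
    return [(b - a) / h for a, b in zip(fx, fxh)]

def circle(t):
    """Illustrative map from R to R^2 (an assumption, not from the text)."""
    return [math.cos(t), math.sin(t)]

x = 0.3
exact = [-math.sin(x), math.cos(x)]   # known derivative of circle at x
for h in (1e-1, 1e-3, 1e-5):
    q = difference_quotient(circle, x, h)
    err = max(abs(a - b) for a, b in zip(q, exact))
    print(f"h={h:g}  quotient={q}  error={err:.2e}")
```

The printed error shrinks roughly in proportion to h, as expected for a one-sided quotient.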

We will however see in this appendix some generalizations of derivatives.

Derivatives in the distribution sense


The derivative in the usual function sense is not defined for discontinuous functions. Distribution theory makes it possible, in particular, to generalize the classical notion of derivative to discontinuous functions.


The derivative of a distribution T is the distribution T' defined by:

\forall \phi \in {\mathcal D}, \langle T'|\phi\rangle=-\langle T|\phi'\rangle


Let f be a summable function. Assume that f is discontinuous at N points a_i, and denote by \sigma_{a_i}=f(a_i^+)-f(a_i^-) the jump of f at a_i. Assume that f' is locally summable and defined almost everywhere. It defines a distribution T_{f'}. The derivative (T_f)' of the distribution associated with f is:

(T_f)'=T_{f'}+\sum \sigma_{a_i} \delta_{a_i}

One says that the derivative in the distribution sense is equal to the derivative without precaution, augmented by the Dirac distributions multiplied by the jumps of f. This can be written:

f'=\{f'\}+\sum \sigma_{a_i} \delta_{a_i}
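
The jump formula can be checked numerically on a simple case. The sketch below assumes an illustrative function f with a single jump \sigma=2 at a=0 and \{f'\}=1, and a standard bump function as test function \phi (both are choices made here, not taken from the text). It compares -\langle T_f|\phi'\rangle with \langle T_{f'}|\phi\rangle+\sigma\phi(a).

```python
import math

def bump(x):
    """Smooth test function phi with compact support in (-1, 1)."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def bump_prime(x):
    """Analytic derivative phi' of the bump function."""
    if abs(x) >= 1.0:
        return 0.0
    return bump(x) * (-2.0 * x) / (1.0 - x * x) ** 2

def f(x):
    """Piecewise smooth function with a jump sigma = 2 at a = 0 and {f'} = 1."""
    return x if x < 0.0 else x + 2.0

def midpoint(g, a, b, n=20000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

# Left-hand side: <(T_f)', phi> = -<T_f, phi'>, integrated on each smooth piece.
lhs = -(midpoint(lambda x: f(x) * bump_prime(x), -1.0, 0.0)
        + midpoint(lambda x: f(x) * bump_prime(x), 0.0, 1.0))

# Right-hand side: <T_{f'}, phi> + sigma * phi(a), with {f'} = 1, sigma = 2, a = 0.
rhs = midpoint(bump, -1.0, 1.0) + 2.0 * bump(0.0)

print(lhs, rhs)   # the two numbers should agree up to quadrature error
```

Splitting the integral at the discontinuity keeps each integrand smooth, so the midpoint rule converges quickly.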

Case of distributions of several variables


Using derivatives without precautions, the action of differential operators in the distribution sense can be written as follows, in the case where the functions on which they act are discontinuous across a surface S:

\frac{\partial f}{\partial x_i}=\{\frac{\partial f}{\partial x_i}\}+n_i\sigma_f \delta_S

\mbox{ grad } f=\{\mbox{ grad } f\}+n \sigma_f \delta_S

\mbox{ div } a=\{ \mbox{ div } a\}+n.\sigma_a \delta_S

\mbox{ rot } a = \{\mbox{ rot } a\}+n\wedge \sigma_a \delta_S

where f is a scalar function, a a vector function, n the unit normal to S, \sigma the jump of a or f across the surface S, and \delta_S the surface Dirac distribution. These formulas make it possible to establish the Green function introduced for tensors. The geometrical implications of the differential operators are considered in the next appendix chapter.

Example: Electromagnetism. The fundamental laws of electromagnetism are the Maxwell equations:

\mbox{ rot } E=-\frac{\partial B}{\partial t}

\mbox{ rot } H=j+\frac{\partial D}{\partial t}

\mbox{ div } D=\rho

\mbox{ div } B=0

These equations are also true in the distribution sense. Books on electromagnetism classically devote a chapter to the study of boundary conditions and passage conditions. Using distributions allows this case to be treated as a particular case of the general equations. Consider, for instance, a charge distribution defined by:

\rho=\rho_v+\rho_s\delta_S
where \rho_v is a volume charge density and \rho_s a surface charge density, and a current distribution defined by:

j=j_v+j_s\delta_S
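where j_v is a volume current density and j_s a surface current density. Using the formulas of the section on distributions of several variables above, one obtains the following passage relations:

n\wedge \sigma_E=0

n\wedge \sigma_H=j_s

n.\sigma_D=\rho_s

n.\sigma_B=0

where the coefficients of the surface Dirac distribution \delta_S have been identified (see ([#References|references])).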
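
As an illustration of the identification step, here is a short worked derivation for the equation \mbox{ div } D=\rho. It is a sketch under the assumptions that D is smooth on each side of S, that \sigma_D denotes its jump across S, that n is the unit normal to S, and that the charge is the sum of a volume part \rho_v and a surface part \rho_s carried by S.

```latex
% Divergence of D in the distribution sense (jump formula for div):
\mbox{ div } D = \{\mbox{ div } D\} + n.\sigma_D\,\delta_S
% Maxwell equation with a volume charge and a surface charge on S:
\mbox{ div } D = \rho_v + \rho_s\,\delta_S
% Identifying the regular parts and the coefficients of \delta_S:
\{\mbox{ div } D\} = \rho_v, \qquad n.\sigma_D = \rho_s
```

The same identification applied to the three other Maxwell equations gives the remaining passage relations.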

Example: Electrical circuits

As Maxwell equations are true in the distribution sense (see the previous example), the equations of electricity are also true in the distribution sense. Distribution theory makes it possible to justify some claims that are sometimes left unjustified in electricity courses. Consider the equation:

U=L\frac{di}{dt}
This equation implies that even if U is not continuous, i is. Indeed, if i were not continuous, the derivative \frac{di}{dt} would create a Dirac distribution on the right-hand side. Consider the equation:

i=\frac{dq}{dt}
This equation implies that q(t) is continuous even if i is discontinuous.

Example: Fluid mechanics. Conservation laws are true in the distribution sense. Using distribution derivatives, the so-called "discontinuity" relations can be obtained immediately ([#References|references]).

Differentiation of stochastic processes


When one speaks of stochastic processes ([#References|references]), one adds the notion of time. Taking again the example of the dice, if we repeat the experiment N times, then the set \Omega' of possible results has 6^N elements (its size grows exponentially with N). Using this \Omega', we can define a probability P'. So, from the first random variable X, we can define another random variable X_t:


Let X be a random variable. A stochastic process (associated with X) is a function of X and t.

X_t is called a stochastic function of X or a stochastic process. Generally, the probability P(X_t\in \mathrel{[}x,x+dx\mathrel{[} \mbox{ at  } t_i) depends on the history of the values of X_t before t_i. One defines the conditional probability P(X_{t=t_i}\in \mathrel{[}x,x+dx\mathrel{[}|X_{t< t_i}) as the probability for X_t to take a value between x and x+dx at time t_i, knowing the values of X_t at times prior to t_i (the "history" of X_t). A Markov process is a stochastic process with the property that for any set of successive times t_1,\dots,t_n one has:

P_{1|n-1}(X_{t=t_n}\in \mathrel{[}x,x+dx\mathrel{[}|X_{t_1}\dots X_{t_{n-1}})=P_{1|1}(X_{t=t_n}\in \mathrel{[}x,x+dx\mathrel{[}|X_{t_{n-1}})

P_{i|j} denotes the probability for i conditions to be satisfied, knowing j prior events. In other words, the probability distribution of X_t at time t_n depends only on the value of X_t at the previous time t_{n-1}. A Markov process is thus defined by P_1 and the transition probability P_{1|1} (or equivalently by the density function f_1(x,t) and the transition density function f_{1|1}(x_2,t_2|x_1,t_1)). It can be shown ([#References|references]) that two functions f_1 and f_{1|1} define a Markov process if and only if they satisfy:

  • the Chapman-Kolmogorov equation:

f_{1|1}(x_3,t_3|x_1,t_1)=\int f_{1|1}(x_3,t_3|x_2,t_2)f_{1|1}(x_2,t_2|x_1,t_1)  dx_2

  • the compatibility relation between f_1 and f_{1|1}:

f_1(x_2,t_2)=\int f_{1|1}(x_2,t_2|x_1,t_1)f_1(x_1,t_1)dx_1

A Wiener process (or Brownian motion) is a Markov process for which:

f_{1|1}(x_2,t_2|x_1,t_1)= \frac{1}{\sqrt{2\pi (t_2-t_1)}}e^{-\frac{(x_2-x_1)^2}{2(t_2-t_1)}}
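
One can check numerically that this Gaussian transition density is compatible with the Chapman-Kolmogorov equation. The following sketch uses a midpoint rule over the intermediate position x_2; the sample times and positions, the integration window and the number of points are illustrative assumptions.

```python
import math

def wiener_kernel(x2, t2, x1, t1):
    """Transition density f_{1|1}(x2, t2 | x1, t1) of the Wiener process."""
    dt = t2 - t1
    return math.exp(-(x2 - x1) ** 2 / (2.0 * dt)) / math.sqrt(2.0 * math.pi * dt)

def chapman_kolmogorov_rhs(x3, t3, x1, t1, t2, half_width=12.0, n=6000):
    """Midpoint-rule integral over x2 of f(x3,t3|x2,t2) f(x2,t2|x1,t1)."""
    h = 2.0 * half_width / n
    total = 0.0
    for k in range(n):
        x2 = -half_width + (k + 0.5) * h
        total += wiener_kernel(x3, t3, x2, t2) * wiener_kernel(x2, t2, x1, t1)
    return h * total

x1, t1, t2, x3, t3 = 0.0, 0.0, 0.5, 0.7, 1.5   # arbitrary illustrative values
lhs = wiener_kernel(x3, t3, x1, t1)
rhs = chapman_kolmogorov_rhs(x3, t3, x1, t1, t2)
print(lhs, rhs)   # both sides should agree up to quadrature error
```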

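Using the second of the two relations above, for a process starting from x=0 at t=0, one gets:

f_1(x,t)=\frac{1}{\sqrt{2\pi t}}e^{-\frac{x^2}{2t}}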

As stochastic processes were defined as functions of a random variable and time, a large class of stochastic processes can be defined as functions of Brownian motion (or Wiener process) W_t (this definition excludes, however, discontinuous cases such as Poisson processes). This is our second definition of a stochastic process:


Let W_t be a Brownian motion. A stochastic process is a function of W_t and t.

For instance, a model of the time evolution of stock prices ([#References|references]) is

X_t=e^{(\sigma W_t+(\mu -\frac{1}{2}\sigma^2)t)}
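
As a quick illustration of this model, the sketch below draws W_t from a centred Gaussian of variance t (the Wiener process started at 0) and estimates the mean of X_t by Monte Carlo. The parameter values, the sample size and the seed are illustrative assumptions. For this model the mean is e^{\mu t}, which the estimate should approach.

```python
import math
import random

def stock_model(w, t, mu, sigma):
    """X_t = exp(sigma * W_t + (mu - sigma^2 / 2) * t) for a given value w of W_t."""
    return math.exp(sigma * w + (mu - 0.5 * sigma ** 2) * t)

def sample_mean(t, mu, sigma, n_samples, seed=0):
    """Monte Carlo estimate of E[X_t], drawing W_t from N(0, t)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        w = rng.gauss(0.0, math.sqrt(t))
        total += stock_model(w, t, mu, sigma)
    return total / n_samples

mu, sigma, t = 0.1, 0.2, 1.0          # illustrative parameter values
estimate = sample_mean(t, mu, sigma, n_samples=100_000)
print(estimate, math.exp(mu * t))     # the two values should be close
```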

A stochastic differential equation, of the general form

dX_t=a(t,X_t)dt+b(t,X_t)dW_t

where a and b are given functions, gives an implicit definition of the stochastic process. The rules of differentiation with respect to the Brownian motion variable W_t differ from the rules of differentiation with respect to the ordinary time variable. They are given by the Itô formula ([#References|references]). To understand the difference between the differentiation of a Newtonian function and that of a stochastic function, consider the Taylor expansion, up to second order, of a function f(W_t):

df(W_t)=f^{'}(W_t)dW_t+\frac{1}{2}f^{''}(W_t)(dW_t)^2+\dots
Usually (for Newtonian functions), the differential df(W_t) is just f^{'}(W_t)dW_t. But for a stochastic process f(W_t), the second-order term \frac{1}{2}f^{''}(W_t)(dW_t)^2 is no longer negligible. Indeed, as can be seen using properties of the Brownian motion, we have:

\int_0^t(dW_s)^2=t \mbox{ or } (dW_t)^2=dt


The figure below illustrates the difference between a stochastic process (simple Brownian motion in the picture) and a differentiable function. The Brownian motion has a self-similar structure under progressive zooms.

[Figure: Comparison of a progressive zooming on a Brownian motion and on a differentiable function]
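
The non-negligible second-order term can also be seen numerically. The sketch below samples a Brownian path on a fine grid (step count and seed are illustrative assumptions) and compares the sum of squared increments with the same sum for a smooth function: the first stays close to the elapsed time, the second vanishes as the grid is refined.

```python
import math
import random

def brownian_path(t_end, n_steps, seed=0):
    """Sample W at n_steps + 1 equally spaced times, with W_0 = 0."""
    rng = random.Random(seed)
    dt = t_end / n_steps
    path = [0.0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return path

def sum_squared_increments(values):
    """Return the sum of (v[k+1] - v[k])^2 along a sampled path."""
    return sum((b - a) ** 2 for a, b in zip(values, values[1:]))

t_end, n_steps = 1.0, 100_000
w = brownian_path(t_end, n_steps)
smooth = [math.sin(k * t_end / n_steps) for k in range(n_steps + 1)]

print(sum_squared_increments(w))       # stays close to t_end
print(sum_squared_increments(smooth))  # shrinks like 1 / n_steps
```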



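Let us here just mention the most basic scheme for integrating stochastic processes numerically. Consider the time integration problem:

dX_t=a(t,X_t)dt+b(t,X_t)dW_t

with initial value:

X_{t_0}=X_0

The most basic way to approximate the solution of the previous problem is to use the Euler (or Euler-Maruyama) scheme. This scheme satisfies the following iterative relation:

Y_{n+1}=Y_n+a(\tau_n,Y_n)(\tau_{n+1}-\tau_n)+b(\tau_n,Y_n)(W_{\tau_{n+1}}-W_{\tau_n})

where the \tau_n are the discretization times and Y_n approximates X_{\tau_n}.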

More sophisticated methods can be found in ([#References|references]).
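
A minimal sketch of the Euler-Maruyama scheme follows, for an equation of the form dX_t=a(t,X_t)dt+b(t,X_t)dW_t. The function name `euler_maruyama`, the test equation (a=\mu x, b=\sigma x, whose closed-form solution is the exponential stock model X_t=e^{\sigma W_t+(\mu-\sigma^2/2)t}), the parameters and the seed are all assumptions made for illustration.

```python
import math
import random

def euler_maruyama(a, b, x0, t0, t_end, n_steps, seed=0):
    """Integrate dX = a(t, X) dt + b(t, X) dW with the Euler-Maruyama scheme.

    Returns the final value X_N and the final Brownian value W_N - W_0,
    so that the result can be compared with a closed-form solution.
    """
    rng = random.Random(seed)
    dt = (t_end - t0) / n_steps
    t, x, w = t0, x0, 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))   # Brownian increment over dt
        x += a(t, x) * dt + b(t, x) * dw
        w += dw
        t += dt
    return x, w

mu, sigma = 0.1, 0.2                          # illustrative parameters
x_num, w_end = euler_maruyama(lambda t, x: mu * x,
                              lambda t, x: sigma * x,
                              x0=1.0, t0=0.0, t_end=1.0, n_steps=10_000)
x_exact = math.exp(sigma * w_end + (mu - 0.5 * sigma ** 2) * 1.0)
print(x_num, x_exact)   # same Brownian path, so the two should be close
```

Comparing along the same Brownian path measures the pathwise error of the scheme rather than a statistical average.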

Functional derivative

Let I(\phi) be a functional. To calculate the differential dI(\phi) of the functional I(\phi), one expresses the difference I(\phi+d\phi)-I(\phi) as a functional of d\phi.

The functional derivative of I, denoted \frac{\delta I}{\delta \phi}, is given by the limit:

\frac{\delta I}{\delta \phi}=\lim_{a\rightarrow 0}\frac{1}{a}\frac{\partial I}{\partial \phi_i}

where a is a real number (the discretization step) and \phi_i=\phi(ia).

Here are some examples:


If I(\phi)=\int f(y)\phi^p(y)dy then \frac{\delta I}{\delta \phi(x)}=pf(x)\phi^{p-1}(x).


If I(\phi)=\int V(\phi(y))dy then \frac{\delta I}{\delta \phi(x)}=V'(\phi(x)).
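
The discretized definition can be tested on the first example. In the sketch below the integral is replaced by a Riemann sum with spacing a, and the partial derivative with respect to \phi_i is divided by a (the normalisation assumed in this sketch, which is what makes the limit finite). The choices f=\cos, \phi(y)=1+y^2, p=3 and the grid are illustrative assumptions.

```python
import math

def discretized_functional(phi, f, p, a):
    """I(phi) ~ sum_i a * f(i a) * phi_i^p, a Riemann sum with spacing a."""
    return sum(a * f(i * a) * phi_i ** p for i, phi_i in enumerate(phi))

def functional_derivative(phi, f, p, a, i, eps=1e-6):
    """(1/a) dI/dphi_i, with dI/dphi_i from a central finite difference."""
    up = list(phi)
    down = list(phi)
    up[i] += eps
    down[i] -= eps
    d_i = (discretized_functional(up, f, p, a)
           - discretized_functional(down, f, p, a)) / (2.0 * eps)
    return d_i / a

a, p = 0.01, 3                                      # illustrative spacing and power
phi = [1.0 + (i * a) ** 2 for i in range(101)]      # phi(y) = 1 + y^2 on [0, 1]
i = 40                                              # grid point x = i * a
numeric = functional_derivative(phi, math.cos, p, a, i)
expected = p * math.cos(i * a) * phi[i] ** (p - 1)  # p f(x) phi^{p-1}(x)
print(numeric, expected)
```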


Comparison of tensor values at different points

Expansion of a function in series about x=a


A function f admits a series expansion at order n around x=a if there exist numbers (\lambda_0,\dots,\lambda_n) such that:

f(a+h)=\sum_{k=0}^n \lambda_k h^k+h^n\epsilon(h)

where \epsilon(h) tends to zero when h tends to zero.


If a function is n times differentiable at a, then it admits a series expansion at order n around x=a, given by the Taylor-Young formula:

f(a+h)=\sum_{k=0}^n \frac{1}{k!}f^{(k)}(a) h^k+h^n\epsilon(h)

where \epsilon(h) tends to zero when h tends to zero and where f^{(k)}(a) is the k-th derivative of f at x=a.
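
The behaviour of \epsilon(h) can be observed numerically. The sketch below assumes the illustrative case f=\exp, a=0 and n=3 (all derivatives equal to 1) and evaluates \epsilon(h) for decreasing h.

```python
import math

def taylor_polynomial(derivatives_at_a, h):
    """Evaluate sum_k f^(k)(a) h^k / k! for the given list of derivatives."""
    return sum(d * h ** k / math.factorial(k)
               for k, d in enumerate(derivatives_at_a))

def epsilon(f, a, derivatives_at_a, h):
    """Return eps(h) = (f(a+h) - Taylor polynomial) / h^n, n = expansion order."""
    n = len(derivatives_at_a) - 1
    return (f(a + h) - taylor_polynomial(derivatives_at_a, h)) / h ** n

# Illustrative choice: f = exp at a = 0, order n = 3; all derivatives equal 1.
derivs = [1.0, 1.0, 1.0, 1.0]
for h in (1e-1, 1e-2, 1e-3):
    print(h, epsilon(math.exp, 0.0, derivs, h))
```

For this choice \epsilon(h) behaves like h/24, so it decreases by about a factor of ten at each line.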

Note that the converse of the theorem is false: f(x)=x^3\sin(\frac{1}{x}), extended by f(0)=0, is a function that admits an expansion around zero at order 2 but is not twice differentiable at zero.


Non objective quantities

Consider two points M and M' of coordinates x^i and x^i+dx^i. A first variation often considered in physics is:


d(a^ie_i)=\frac{\partial a^i}{\partial x^j}dx^j e_i

The non objective variation is

da^i=\frac{\partial a^i}{\partial x^j}dx^j

Note that da^i is not a tensor and that the first equation above assumes that e_i does not change from point M to point M'. It does not obey the tensor transformation relations. This is why it is called a non objective variation. An objective variation, which allows a tensor to be defined, is presented in the next section: it takes into account the variations of the basis vectors.


Example: Lagrangian speed. The Lagrangian description of the movement of a particle number a is given by its position r_a at each time t. If

r_a=r_a(t)

the Lagrangian speed is:

v_a=\frac{dr_a}{dt}

The derivative introduced in the example above is not objective, which means that it is not invariant under a change of axes. In particular, one has the famous vector derivation formula:


\left(\frac{dA}{dt}\right)_R=\left(\frac{dA}{dt}\right)_{R_1}+\omega_{R_1/R}\wedge A


The Eulerian description of a fluid is given by a field of "Eulerian" velocities v(x,t) and initial conditions, such that the Lagrangian position r_a of each particle satisfies, together with its initial condition:

\frac{dr_a}{dt}=v(r_a,t)

Eulerian and Lagrangian descriptions are equivalent.


Let us consider the variation of the speed field u between two positions, at time t. If the speed field u is differentiable, there exists a linear mapping K such that:


u_i(\vec r+\delta\vec r)-u_i(\vec r)=K_{ij}\delta r_j + o(\|\delta\vec r\|)

K_{ij}=u_{i,j} is called the speed field gradient tensor. The tensor K can be split into a symmetric and an antisymmetric part:

K_{ij}=\frac{1}{2}(u_{i,j}+u_{j,i})+\frac{1}{2}(u_{i,j}-u_{j,i})

The symmetric part is called the dilatation tensor, the antisymmetric part the rotation tensor. Now, u_i(\vec r+\delta\vec r)-u_i(\vec r)=\frac{d\delta r_i}{dt}. Thus, using the first-order expansion of the speed field above:

\frac{d\delta\vec r}{dt}=K\delta\vec r

This result, true for the vector \delta\vec r, is also true for any vector \vec a. This last equation allows one to show that

  • The derivative with respect to time of the elementary volume dv in the neighbourhood of a particle that is followed in its movement is:

  \frac{d(dv)}{dt}=\mbox{ div } u dv

  Indeed, for an elementary volume \delta v=\delta x\delta y\delta z:

  d(\delta v)=d(\delta x)\delta y\delta z+d(\delta y)\delta x\delta z+d(\delta z)\delta x\delta y

  so that, applying the previous equation to each edge of the elementary volume:

  \frac{d(\delta v)}{dt}=\mbox{ div } u \delta v

  • The speed field of a solid is antisymmetric[1].
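
These properties of K can be illustrated with small matrices. The sketch below splits a gradient tensor into its symmetric and antisymmetric parts, checks that the trace (that is, \mbox{ div } u) is carried by the symmetric part, and verifies that the gradient of a solid-body field u=\omega\wedge r has no symmetric part. The numerical entries and the helper names are illustrative assumptions.

```python
def split_gradient(k):
    """Split a 3x3 matrix K into symmetric and antisymmetric parts."""
    sym = [[0.5 * (k[i][j] + k[j][i]) for j in range(3)] for i in range(3)]
    anti = [[0.5 * (k[i][j] - k[j][i]) for j in range(3)] for i in range(3)]
    return sym, anti

def rigid_rotation_gradient(omega):
    """K_ij = du_i/dx_j for the solid-body field u = omega ^ r."""
    wx, wy, wz = omega
    return [[0.0, -wz, wy],
            [wz, 0.0, -wx],
            [-wy, wx, 0.0]]

def trace(k):
    """Trace of K, equal to div u when K_ij = u_{i,j}."""
    return sum(k[i][i] for i in range(3))

# A generic gradient (illustrative numbers) and a solid-body rotation.
k_generic = [[1.0, 2.0, 0.0],
             [0.5, -3.0, 4.0],
             [1.5, 0.0, 2.5]]
sym, anti = split_gradient(k_generic)
print(trace(k_generic), trace(sym), trace(anti))  # div u sits in the symmetric part

k_solid = rigid_rotation_gradient((0.3, -1.2, 2.0))
sym_s, anti_s = split_gradient(k_solid)
print(sym_s)   # no dilatation for a solid
```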


Example: Particle derivative of a tensor. The particle derivative (or material derivative) is the time derivative of a quantity defined on a set of particles that are followed during their movement. When using Lagrange variables, it can be identified with the partial derivative with respect to time ([#References|references]).

The following property can be shown ([#References|references]). Let us consider the integral:

I=\int_V \omega

where V is a connected manifold of dimension p (volume, surface...) that is followed during its movement and \omega a differential form of degree p expressed in Euler variables. The particle derivative of I satisfies:

\frac{d}{dt}\int_V \omega = \int_V \frac{d \omega}{dt}

A proof of this result can be found in ([#References|references]).


Consider the integral

I=\int_D C(x,t) dv

where D is a bounded connected domain that is followed during its movement, and C is a scalar-valued function, continuous on the closure of D and differentiable in D. The particle derivative of I is

\frac {dI}{dt}=\int_D\left\{\frac{\partial C}{\partial t}+\mbox{ div }(C\vec{u})\right\}dv,

since, from the formula for the derivative of the elementary volume obtained above:

\frac{d}{dt}(dv)=\mbox{ div }\vec{u} dv.
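
This formula can be checked on a one-dimensional case. The sketch below assumes the illustrative flow u(x,t)=x, so that a particle starting at x_0 is at x_0e^t and the interval D(t)=[e^t,2e^t] is carried by the flow, together with the field C(x,t)=xt. It compares a numerical time derivative of I with the integral of \frac{\partial C}{\partial t}+\frac{\partial (Cu)}{\partial x}.

```python
import math

def midpoint(g, a, b, n=2000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

def domain(t):
    """Interval D(t) = [e^t, 2 e^t] carried by the flow u(x, t) = x."""
    return math.exp(t), 2.0 * math.exp(t)

def c(x, t):
    """Illustrative scalar field C(x, t) = x t."""
    return x * t

def integral_i(t):
    """I(t) = integral of C over the moving domain D(t)."""
    lo, hi = domain(t)
    return midpoint(lambda x: c(x, t), lo, hi)

def transport_rhs(t):
    """Integral over D(t) of dC/dt + d(C u)/dx, here x + 2 x t."""
    lo, hi = domain(t)
    return midpoint(lambda x: x + 2.0 * x * t, lo, hi)

t, dt = 0.5, 1e-5
lhs = (integral_i(t + dt) - integral_i(t - dt)) / (2.0 * dt)  # dI/dt numerically
print(lhs, transport_rhs(t))   # the two values should agree closely
```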


Covariant derivative

In this section, a derivative that is independent of the considered reference frame is introduced (an objective derivative). Consider the difference between a quantity a evaluated at two points M and M'.

da=a(M')-a(M)=da^i e_i+a^i de_i

As in the section on non objective quantities:

da^i e_i=\frac{\partial a^i}{\partial x^j}dx^j e_i

The variation de_i is linearly connected to the e_j's via the tangent application:

de_i=d\omega_i^j e_j

The rotation vector depends linearly on the displacement:

d\omega_i^j=\Gamma^{j}_{ik}dx^k


The symbols \Gamma^{j}_{ik}, called Christoffel symbols[2], are not[3] tensors. They connect the properties of space at M with its properties at point M'. By a change of index in the previous equations:


da=\left(\frac{\partial a^i}{\partial x^j}+a^k\Gamma^i_{kj}\right)dx^j e_i

As the x^j's are independent variables, the coefficient of each dx^j can be identified separately.


The covariant derivative of a contravariant vector a^i is


\frac{Da^i}{Dx^j}=\frac{\partial a^i}{\partial x^j}+a^k\Gamma^i_{kj}
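
A concrete check in plane polar coordinates may help. The sketch below assumes the natural basis (e_r,e_\theta) with e_\theta of length r, for which the non-zero Christoffel symbols are \Gamma^r_{\theta\theta}=-r and \Gamma^\theta_{r\theta}=\Gamma^\theta_{\theta r}=1/r. For a constant Cartesian vector field the partial derivatives of the components do not vanish, but the covariant derivative does. The field, the evaluation point and the finite-difference step are illustrative choices.

```python
import math

def components(r, theta):
    """Polar components (a^r, a^theta) of the constant Cartesian field e_x."""
    return [math.cos(theta), -math.sin(theta) / r]

def christoffel(r):
    """Gamma[i][k][j] = Gamma^i_{kj} in plane polar coordinates (0 = r, 1 = theta)."""
    gamma = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    gamma[0][1][1] = -r          # Gamma^r_{theta theta}
    gamma[1][0][1] = 1.0 / r     # Gamma^theta_{r theta}
    gamma[1][1][0] = 1.0 / r     # Gamma^theta_{theta r}
    return gamma

def partial(i, j, r, theta, h=1e-6):
    """Central finite difference of a^i with respect to x^j."""
    if j == 0:
        up, down = components(r + h, theta), components(r - h, theta)
    else:
        up, down = components(r, theta + h), components(r, theta - h)
    return (up[i] - down[i]) / (2.0 * h)

def covariant_derivative(i, j, r, theta):
    """Da^i/Dx^j = da^i/dx^j + a^k Gamma^i_{kj}."""
    a = components(r, theta)
    gamma = christoffel(r)
    return partial(i, j, r, theta) + sum(a[k] * gamma[i][k][j] for k in range(2))

r, theta = 2.0, 0.7
for i in range(2):
    for j in range(2):
        print(i, j, partial(i, j, r, theta), covariant_derivative(i, j, r, theta))
```

The last column stays at the level of rounding error, while the partial derivatives alone do not.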

The differential can thus be written:

da=\frac{Da^i}{Dx^j}dx^j e_i

which is the generalization of the differential:

da^i=\frac{\partial a^i}{\partial x^j}dx^j

considered when there is no transformation of axes. This formula can be generalized to tensors.


For the calculation of the particle derivative presented in the section on non objective quantities, the x^j are the coordinates of the point, but the quantity to be differentiated also depends on time. That is the reason why a term \frac{\partial x^j}{\partial t} appears in the expression of the particle derivative but not in the definition of the covariant derivative.


From the definition of the covariant derivative, the vector derivation formula given in the section on non objective quantities can be recovered when:

de_i=\omega_i^j dt e_j


In spaces with a metric, the \Gamma^i_{kj} are functions of the metric tensor g_{ij}.

Covariant differential operators

The following differential operators with tensorial properties can be defined:

  • Gradient of a scalar:

  a=\mbox{ grad } V

with a_i=\frac{\partial V}{\partial x^i}.

  • Curl (rotational) of a vector:

  b=\mbox{ rot } a

with b_{ik}=\frac{\partial a_k}{\partial x^i}-\frac{\partial  a_i}{\partial x^k}. The tensorial character of the curl can be shown using the tensorial character of the covariant derivative:

  \frac{\partial a_k}{\partial x^i}-\frac{\partial  a_i}{\partial x^k}=\frac{D a_k}{D x^i}-\frac{D  a_i}{D x^k}

  • Divergence of a contravariant density:

  d=\mbox{ div } a

where d=\frac{\partial a^i}{\partial x^i}.

For more details on operators that can be defined on tensors, see ([#References|references]).


In an orthonormal Euclidean space one has the following relations:

\mbox{ rot }(\mbox{ grad }\phi)=0


\mbox{ div }(\mbox{ rot }(a))=0

\nabla\wedge (\nabla\wedge c)=\nabla(\nabla.c)-\nabla^2c
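
The first two relations can be checked numerically with central finite differences in Cartesian coordinates. In the sketch below, the scalar field `phi`, the vector field `a`, the evaluation point and the step H are illustrative assumptions. Since central differences along different axes commute, both results vanish up to rounding error.

```python
import math

H = 1e-3  # finite-difference step (illustrative choice)

def partial(f, axis):
    """Return the central-difference approximation of df/dx_axis as a function."""
    def df(p):
        up = list(p)
        down = list(p)
        up[axis] += H
        down[axis] -= H
        return (f(up) - f(down)) / (2.0 * H)
    return df

def grad(phi):
    """Gradient of a scalar field, as a list of three component functions."""
    return [partial(phi, i) for i in range(3)]

def rot(a):
    """Curl of a vector field given as a list of three component functions."""
    return [lambda p: partial(a[2], 1)(p) - partial(a[1], 2)(p),
            lambda p: partial(a[0], 2)(p) - partial(a[2], 0)(p),
            lambda p: partial(a[1], 0)(p) - partial(a[0], 1)(p)]

def div(a):
    """Divergence of a vector field given as a list of three component functions."""
    return lambda p: sum(partial(a[i], i)(p) for i in range(3))

def phi(p):
    """Illustrative scalar field."""
    return math.sin(p[0]) * p[1] ** 2 + math.exp(p[2]) * p[0]

a = [lambda p: p[1] * p[2] ** 2,
     lambda p: math.sin(p[0] * p[2]),
     lambda p: p[0] * math.exp(p[1])]

point = [0.3, -0.4, 0.8]
rot_grad = [component(point) for component in rot(grad(phi))]
div_rot = div(rot(a))(point)
print(rot_grad, div_rot)   # all values should be negligibly small
```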

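  1. Indeed, let u and v be two position vectors bound to the solid. By definition of a solid, the scalar product u.v remains constant as time evolves. So:

    \frac{d(u.v)}{dt}=\frac{du}{dt}.v+u.\frac{dv}{dt}=0

    that is, since \frac{du}{dt}=Ku and \frac{dv}{dt}=Kv:

    K_{ij}u_jv_i+K_{ij}u_iv_j=0

    As this equality is true for any u,v, one has:

    K_{ij}=-K_{ji}

    In other words, K is antisymmetric. So, from the preceding theorem, and since an antisymmetric tensor acts in three dimensions as a vector product with some vector \omega:

    \frac{du}{dt}=\omega\wedge u

    This can be rewritten saying that the speed field is antisymmetric, i.e., for any two points O and P of the solid, one has:

    V_P=V_O+\omega\wedge OP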

  2. In a space with metric g_{ij}, the coefficients \Gamma^{i}_{hk} can be expressed as functions of the coefficients g_{ij}.
  3. Just as \frac{\partial a^i}{\partial x^j} is not a tensor. However, the differential d(a^ie_i) written with the covariant derivative does have tensor properties.