Calculus/Derivatives of multivariate functions

From Wikibooks, open books for an open world
Derivatives of multivariate functions

The matrix of a linear transformation


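Theorem. A linear transformation $L: \mathbb{R}^n \to \mathbb{R}^m$ amounts to multiplication by a uniquely defined matrix; that is, there exists a unique matrix $A \in \mathbb{R}^{m \times n}$ such that

$$L(x) = Ax \quad \text{for all } x \in \mathbb{R}^n.$$

Proof. We set the column vectors

$$a_j := L(e_j), \quad j \in \{1, \ldots, n\},$$

where $\{e_1, \ldots, e_n\}$ is the standard basis of $\mathbb{R}^n$. Then we define from this

$$A := \begin{pmatrix} a_1 & \cdots & a_n \end{pmatrix}$$

and note that for any vector $x = (x_1, \ldots, x_n)^T$ of $\mathbb{R}^n$ we obtain

$$Ax = \sum_{j=1}^n x_j a_j = \sum_{j=1}^n x_j L(e_j) = L\left( \sum_{j=1}^n x_j e_j \right) = L(x).$$

Thus, we have shown existence. To prove uniqueness, suppose there were any other matrix $B \in \mathbb{R}^{m \times n}$ with the property that $Bx = L(x)$ for all $x \in \mathbb{R}^n$. Then in particular,

$$B e_j = L(e_j) = a_j \quad \text{for all } j \in \{1, \ldots, n\},$$

which already implies that $A = B$ (since all the columns of both matrices are identical).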
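
The existence argument is constructive: apply the map to each standard basis vector and use the results as columns. A minimal Python sketch of that recipe follows. The helper names `matrix_of` and `mat_vec` and the example map `L` are hypothetical, chosen only for illustration.

```python
def matrix_of(L, n):
    """Return the matrix (as a list of rows) whose j-th column is L(e_j)."""
    columns = []
    for j in range(n):
        # j-th standard basis vector of R^n
        e_j = [1.0 if k == j else 0.0 for k in range(n)]
        columns.append(L(e_j))
    m = len(columns[0])
    # Transpose the list of columns into a list of rows.
    return [[columns[j][i] for j in range(n)] for i in range(m)]


def mat_vec(A, x):
    """Multiply the matrix A (list of rows) by the vector x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


# Hypothetical example: a linear map from R^3 to R^2.
def L(x):
    return [2.0 * x[0] - x[2], x[0] + 3.0 * x[1]]


A = matrix_of(L, 3)
x = [1.0, -2.0, 4.0]
print(A)              # matrix built from the images of the basis vectors
print(mat_vec(A, x))  # multiplication by A ...
print(L(x))           # ... agrees with applying L directly
```

Only the values of the map on the basis vectors enter the construction, which is also why two matrices representing the same map must agree column by column.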

How to generalise the derivative

It is not immediately straightforward how one would generalize the derivative to higher dimensions. For, if we take the definition of the derivative at a point $x_0$,

$$f'(x_0) = \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h},$$

and insert vectors for $x_0$ and $h$, we would divide the whole thing by a vector. But this is not defined.

Hence, we shall rephrase the definition of the derivative a bit and cast it into a form where it can be generalized to higher dimensions.


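Theorem. Let $f: \mathbb{R} \to \mathbb{R}$ be a one-dimensional function and let $x_0 \in \mathbb{R}$. Then $f$ is differentiable at $x_0$ if and only if there exists a linear function $L: \mathbb{R} \to \mathbb{R}$ such that

$$\lim_{h \to 0} \frac{|f(x_0 + h) - f(x_0) - L(h)|}{|h|} = 0.$$

We note that according to the above, linear functions $\mathbb{R} \to \mathbb{R}$ are given by multiplication by a $1 \times 1$-matrix, that is, a scalar.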


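Proof. First assume that $f$ is differentiable at $x_0$. We set $L(h) := f'(x_0) h$ and obtain

$$\frac{|f(x_0 + h) - f(x_0) - L(h)|}{|h|} = \left| \frac{f(x_0 + h) - f(x_0)}{h} - f'(x_0) \right|,$$

which converges to 0 as $h \to 0$ due to the definition of $f'(x_0)$.

Assume now that we are given an $L$ such that

$$\lim_{h \to 0} \frac{|f(x_0 + h) - f(x_0) - L(h)|}{|h|} = 0.$$

Let $c$ be the scalar associated to $L$, so that $L(h) = ch$. Then by an analogous computation $f$ is differentiable at $x_0$ with $f'(x_0) = c$.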
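
The equivalence can be illustrated numerically. The sketch below assumes the hypothetical example $f(x) = x^3$ at $x_0 = 2$, where the scalar associated to the linear function is $f'(2) = 12$. It prints the quotient $|f(x_0 + h) - f(x_0) - L(h)| / |h|$ for shrinking $h$. This is only an illustration of the limit, not a proof.

```python
def f(x):
    return x ** 3


def remainder_quotient(x0, slope, h):
    """Return |f(x0 + h) - f(x0) - slope * h| / |h|."""
    return abs(f(x0 + h) - f(x0) - slope * h) / abs(h)


x0 = 2.0
slope = 12.0  # f'(2) = 3 * 2**2, the scalar associated to L

# With the correct slope the quotient shrinks together with h.
for h in (1e-1, 1e-2, 1e-3, 1e-4):
    print(h, remainder_quotient(x0, slope, h))

# With a wrong slope the quotient stays near |12 - 11| = 1 instead.
print(remainder_quotient(x0, 11.0, 1e-4))
```

For this particular $f$ the quotient equals $|6h + h^2|$ exactly, so it tends to 0 only when the slope is 12.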

With the latter formulation of differentiability from the above theorem, we may readily generalize to higher dimensions, since division by the Euclidean norm of a vector is defined, and linear mappings are also defined in higher dimensions.


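Definition. A function $f: \mathbb{R}^n \to \mathbb{R}^m$ is called differentiable or totally differentiable at a point $x_0 \in \mathbb{R}^n$ if and only if there exists a linear function $L: \mathbb{R}^n \to \mathbb{R}^m$ such that

$$\lim_{h \to 0} \frac{\|f(x_0 + h) - f(x_0) - L(h)\|}{\|h\|} = 0.$$

We have already proven that this definition coincides with the usual one in the one-dimensional case (that is, $n = m = 1$).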

We have the following theorem:


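Theorem. Let $S \subseteq \mathbb{R}^n$ be a set, let $x_0 \in S$ be an interior point of $S$, and let $f: S \to \mathbb{R}^m$ be a function differentiable at $x_0$. Then the linear map $L: \mathbb{R}^n \to \mathbb{R}^m$ such that

$$\lim_{h \to 0} \frac{\|f(x_0 + h) - f(x_0) - L(h)\|}{\|h\|} = 0$$

is unique; that is, there exists only one such map $L$.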


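Proof. Since $x_0$ is an interior point of $S$, we find $r > 0$ such that $B_r(x_0) \subseteq S$. Let now $K: \mathbb{R}^n \to \mathbb{R}^m$ be any other linear mapping with the property that

$$\lim_{h \to 0} \frac{\|f(x_0 + h) - f(x_0) - K(h)\|}{\|h\|} = 0.$$

We note that for all vectors $e_j$ of the standard basis $\{e_1, \ldots, e_n\}$, the points $x_0 + \lambda e_j$ for $0 < \lambda < r$ are contained within $B_r(x_0)$. Hence, we obtain by the triangle inequality

$$\|L(e_j) - K(e_j)\| = \frac{\|L(\lambda e_j) - K(\lambda e_j)\|}{\|\lambda e_j\|} \le \frac{\|f(x_0 + \lambda e_j) - f(x_0) - L(\lambda e_j)\|}{\|\lambda e_j\|} + \frac{\|f(x_0 + \lambda e_j) - f(x_0) - K(\lambda e_j)\|}{\|\lambda e_j\|}.$$

Taking $\lambda \to 0$, we see that $L(e_j) = K(e_j)$. Thus, $L$ and $K$ coincide on all basis vectors, and since every other vector can be expressed as a linear combination of those, by linearity of $L$ and $K$ we obtain $L = K$.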

Thus, the following definition is justified:


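Definition. Let $f: S \to \mathbb{R}^m$ be a function (where $S$ is a subset of $\mathbb{R}^n$), and let $x_0$ be an interior point of $S$ such that $f$ is differentiable at $x_0$. Then the unique linear function $L: \mathbb{R}^n \to \mathbb{R}^m$ such that

$$\lim_{h \to 0} \frac{\|f(x_0 + h) - f(x_0) - L(h)\|}{\|h\|} = 0$$

is called the differential of $f$ at $x_0$ and is denoted $f'(x_0) := L$.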

Directional and partial derivatives

We shall first define directional derivatives.


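Definition. Let $f: \mathbb{R}^n \to \mathbb{R}^m$ be a function, let $x_0 \in \mathbb{R}^n$, and let $v \in \mathbb{R}^n \setminus \{0\}$ be a vector. If the limit

$$\lim_{h \to 0} \frac{f(x_0 + hv) - f(x_0)}{h}$$

exists, it is called the directional derivative of $f$ at $x_0$ in direction $v$. We denote it by $D_v f(x_0)$.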

The following theorem relates directional derivatives and the differential of a totally differentiable function:


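Theorem. Let $f: \mathbb{R}^n \to \mathbb{R}^m$ be a function that is totally differentiable at $x_0$, and let $v \in \mathbb{R}^n \setminus \{0\}$ be a nonzero vector. Then the directional derivative $D_v f(x_0)$ exists and is equal to $f'(x_0) v$, that is, the differential $f'(x_0)$ of $f$ at $x_0$ applied to $v$.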


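Proof. According to the very definition of total differentiability, with $f'(x_0)$ denoting the differential of $f$ at $x_0$,

$$\lim_{h \to 0} \frac{\|f(x_0 + hv) - f(x_0) - f'(x_0)(hv)\|}{\|hv\|} = 0.$$

Hence,

$$\lim_{h \to 0} \frac{\|f(x_0 + hv) - f(x_0) - f'(x_0)(hv)\|}{|h|} = 0$$

by multiplying the above equation by $\|v\|$. Noting that

$$\frac{\|f(x_0 + hv) - f(x_0) - f'(x_0)(hv)\|}{|h|} = \left\| \frac{f(x_0 + hv) - f(x_0)}{h} - f'(x_0) v \right\|,$$

the theorem follows.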
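
A numerical sketch of this relation, under stated assumptions: take the hypothetical function $f(x, y) = xy$ at $x_0 = (1, 2)$. Its differential there is the linear map $h \mapsto 2h_1 + h_2$, since the remainder $f(x_0 + h) - f(x_0) - (2h_1 + h_2) = h_1 h_2$ is bounded by $\|h\|^2$. The code compares the difference quotient in direction $v = (3, 1)$ with the value of that linear map at $v$. The scalar step is called `t` in the code to keep it apart from the vector increment.

```python
def f(p):
    x, y = p
    return x * y


def L(h):
    # Differential of f at x0 = (1, 2) for this example: h -> 2*h1 + 1*h2.
    return 2.0 * h[0] + 1.0 * h[1]


x0 = (1.0, 2.0)
v = (3.0, 1.0)


def difference_quotient(t):
    """Return (f(x0 + t*v) - f(x0)) / t for a nonzero scalar t."""
    shifted = (x0[0] + t * v[0], x0[1] + t * v[1])
    return (f(shifted) - f(x0)) / t


# The quotients approach the value of the differential applied to v.
for t in (1e-1, 1e-2, 1e-3):
    print(t, difference_quotient(t))

print(L(v))
```

Here the difference quotient equals $7 + 3t$ exactly, so its limit as $t \to 0$ is $7 = L(v)$, in line with the theorem.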

A special case of directional derivatives are partial derivatives:


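Definition. Let $\{e_1, \ldots, e_n\}$ be the standard basis of $\mathbb{R}^n$, let $x_0 \in \mathbb{R}^n$ and let $f: \mathbb{R}^n \to \mathbb{R}^m$ be a function such that the directional derivatives $D_{e_j} f(x_0)$, $j \in \{1, \ldots, n\}$, all exist. Then we set

$$\frac{\partial f}{\partial x_j}(x_0) := D_{e_j} f(x_0)$$

and call it the partial derivative in the direction of $x_j$.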

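In fact, by writing down the definition of $D_{e_j} f(x_0)$, we see that the partial derivative in the direction of $x_j$ is nothing else than the derivative of the function $f$ in the variable $x_j$ at the place $x_0$. That is, for instance, if $f(x, y, z) = x^2 + 4xz + 3yz$, then

$$\frac{\partial f}{\partial x} = 2x + 4z, \quad \frac{\partial f}{\partial y} = 3z, \quad \frac{\partial f}{\partial z} = 4x + 3y;$$

that is, when forming a partial derivative, we regard the other variables as constant and differentiate only with respect to the variable we are considering.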
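
The rule "keep the other variables fixed" translates directly into a numerical approximation. The sketch below uses a hypothetical function $f(x, y) = x^2 y + y^3$ and a hypothetical helper `partial`. It uses a central difference quotient rather than the one-sided quotient from the definition, a swapped-in choice that is usually more accurate numerically.

```python
def f(p):
    x, y = p
    return x ** 2 * y + y ** 3


def partial(func, point, j, h=1e-6):
    """Approximate the j-th partial derivative of func at point (central difference)."""
    forward = list(point)
    backward = list(point)
    forward[j] += h   # move only the j-th variable ...
    backward[j] -= h  # ... and keep all the others fixed
    return (func(forward) - func(backward)) / (2.0 * h)


x0 = (1.0, 2.0)
print(partial(f, x0, 0))  # compare with 2*x*y at (1, 2), which is 4
print(partial(f, x0, 1))  # compare with x**2 + 3*y**2 at (1, 2), which is 13
```

The step size `h=1e-6` is an arbitrary assumption; smaller steps reduce the truncation error but amplify rounding error.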

The Jacobian matrix

From the above, we know that the differential of a function has an associated matrix representing the linear map thus defined. Under a condition, we can determine this matrix from the partial derivatives of the component functions.


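Theorem. Let $f: \mathbb{R}^n \to \mathbb{R}^m$ be such that all partial derivatives exist at $x_0$ and are continuous in each component on $B_r(x_0)$ for a possibly very small, but positive $r > 0$. Then $f$ is totally differentiable at $x_0$ and the differential of $f$ is given by left multiplication by the matrix

$$J_f(x_0) := \begin{pmatrix} \frac{\partial f_1}{\partial x_1}(x_0) & \cdots & \frac{\partial f_1}{\partial x_n}(x_0) \\ \vdots & \ddots & \vdots \\ \frac{\partial f_m}{\partial x_1}(x_0) & \cdots & \frac{\partial f_m}{\partial x_n}(x_0) \end{pmatrix},$$

where $f = (f_1, \ldots, f_m)$.

The matrix $J_f(x_0)$ is called the Jacobian matrix.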


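Proof. Write $J_f(x_0)$ for the matrix of partial derivatives of the component functions $f_1, \ldots, f_m$ at $x_0$. Since the Euclidean norm of a vector is at most the sum of the absolute values of its components, we have

$$\frac{\|f(x_0 + h) - f(x_0) - J_f(x_0) h\|}{\|h\|} \le \sum_{i=1}^m \frac{\left| f_i(x_0 + h) - f_i(x_0) - \sum_{k=1}^n \frac{\partial f_i}{\partial x_k}(x_0) h_k \right|}{\|h\|}.$$

We shall now prove that all summands of the last sum go to 0.

Indeed, let $i \in \{1, \ldots, m\}$. Writing again $h = (h_1, \ldots, h_n)$, we obtain by the one-dimensional mean value theorem, first applied in the first variable, then in the second and so on, the succession of equations

$$\begin{aligned}
f_i(x_0 + h_1 e_1) - f_i(x_0) &= h_1 \frac{\partial f_i}{\partial x_1}(x_0 + t_1 e_1), \\
f_i(x_0 + h_1 e_1 + h_2 e_2) - f_i(x_0 + h_1 e_1) &= h_2 \frac{\partial f_i}{\partial x_2}(x_0 + h_1 e_1 + t_2 e_2), \\
&\;\;\vdots \\
f_i(x_0 + h) - f_i(x_0 + h_1 e_1 + \cdots + h_{n-1} e_{n-1}) &= h_n \frac{\partial f_i}{\partial x_n}(x_0 + h_1 e_1 + \cdots + h_{n-1} e_{n-1} + t_n e_n)
\end{aligned}$$

for suitably chosen $t_k$ between $0$ and $h_k$. We can now sum all these equations together to obtain

$$f_i(x_0 + h) - f_i(x_0) = \sum_{k=1}^n h_k \frac{\partial f_i}{\partial x_k}\left( x_0 + \sum_{l=1}^{k-1} h_l e_l + t_k e_k \right).$$

Let now $\epsilon > 0$. Using the continuity of the $\frac{\partial f_i}{\partial x_k}$ on $B_r(x_0)$, we may choose $\delta > 0$ such that

$$\left| \frac{\partial f_i}{\partial x_k}\left( x_0 + \sum_{l=1}^{k-1} h_l e_l + t_k e_k \right) - \frac{\partial f_i}{\partial x_k}(x_0) \right| < \frac{\epsilon}{n}$$

for $k \in \{1, \ldots, n\}$, given that $\|h\| < \delta$ (which we may assume as $h \to 0$). Hence, we obtain

$$\frac{\left| f_i(x_0 + h) - f_i(x_0) - \sum_{k=1}^n \frac{\partial f_i}{\partial x_k}(x_0) h_k \right|}{\|h\|} \le \sum_{k=1}^n \frac{|h_k|}{\|h\|} \cdot \frac{\epsilon}{n} \le \epsilon$$

and thus the theorem.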
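
The theorem suggests a simple way to sanity-check a hand-computed Jacobian matrix: approximate each partial derivative of each component function numerically and compare entry by entry. The following sketch assumes a hypothetical map $f(x, y) = (x^2 y, \; x + \sin y)$. The names `jacobian_exact` and `jacobian_numeric` are illustrative, and central differences are an assumed choice of approximation.

```python
import math


def f(p):
    x, y = p
    return [x ** 2 * y, x + math.sin(y)]


def jacobian_exact(p):
    """Matrix of partial derivatives of f, computed by hand for this example."""
    x, y = p
    return [[2.0 * x * y, x ** 2],
            [1.0, math.cos(y)]]


def jacobian_numeric(func, point, h=1e-6):
    """Approximate the Jacobian of func at point by central differences."""
    n = len(point)
    m = len(func(point))
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        forward = list(point)
        backward = list(point)
        forward[j] += h
        backward[j] -= h
        f_plus, f_minus = func(forward), func(backward)
        for i in range(m):
            # Row i holds component f_i, column j holds variable x_j.
            J[i][j] = (f_plus[i] - f_minus[i]) / (2.0 * h)
    return J


x0 = (1.0, 2.0)
print(jacobian_exact(x0))
print(jacobian_numeric(f, x0))
```

The row index runs over the component functions and the column index over the variables, matching the layout of the matrix in the theorem.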


If is continuously differentiable at and , then