Real Analysis/Differentiation in Rn

From Wikibooks, open books for an open world

We will first revise some concepts of Linear Algebra that are important in Multivariate Analysis. The reader with no background in Linear Algebra is advised to refer to the book Linear Algebra.

Vector Space

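A set \mathcal{V} is said to be a Vector Space over a field F if and only if the operations of addition and scalar multiplication are defined on it so as to satisfy, for all \mathbf{v_1}, \mathbf{v_2}, \mathbf{v_3}\in\mathcal{V} and c_1,c_2\in F:

(i) Commutativity: \mathbf{v_1}+\mathbf{v_2}=\mathbf{v_2}+\mathbf{v_1}

(ii) Associativity: (\mathbf{v_1}+\mathbf{v_2})+\mathbf{v_3}=\mathbf{v_1}+(\mathbf{v_2}+\mathbf{v_3})

(iii) Identity: There exists \mathbf{0}\in\mathcal{V} such that \mathbf{v_1}+\mathbf{0}=\mathbf{v_1}=\mathbf{0}+\mathbf{v_1}

(iv) Inverse: There exists -\mathbf{v_1}\in\mathcal{V} such that \mathbf{v_1}+(-\mathbf{v_1})=\mathbf{0}

(v) Distributivity over vector addition: c_1(\mathbf{v_1}+\mathbf{v_2})=c_1\mathbf{v_1}+c_1\mathbf{v_2}

(vi) Distributivity over scalar addition: (c_1+c_2)\mathbf{v_1}=c_1\mathbf{v_1}+c_2\mathbf{v_1}

(vii) Compatibility of multiplications: c_1(c_2\mathbf{v_1})=(c_1c_2)\mathbf{v_1}

(viii) Unit: 1\mathbf{v_1}=\mathbf{v_1}, where 1 is the multiplicative identity of F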

Members of a vector space are called "Vectors" and those of the field are called "Scalars". \mathbb{R}^n and the set of all polynomials are examples of vector spaces.

A set of linearly independent vectors that spans the vector space is said to be a Basis for the vector space.

Linear Transformations

Let X,Y be vector spaces.

Let T:X\to Y

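We say that T is a Linear transformation if and only if for all \mathbf{v_1},\mathbf{v_2}\in X and every scalar c,

(i) T(\mathbf{v_1}+\mathbf{v_2})=T(\mathbf{v_1})+T(\mathbf{v_2})

(ii) T(c\mathbf{v_1})=cT(\mathbf{v_1})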
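The two defining properties of a linear transformation, additivity and homogeneity, can be checked on a concrete case. The following is a minimal sketch, assuming the map is given by a matrix acting on \mathbb{R}^3; the matrix A, the vectors and the helper names apply, add and scale are illustrative choices, not part of the text above.

```python
# Sketch: check additivity and homogeneity for the matrix map T(v) = A v.
# A, v1, v2 and c are arbitrary illustrative choices (integers, so the
# comparisons below are exact).

def apply(A, v):
    """Return the matrix-vector product A v for a list-of-rows matrix A."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def add(u, v):
    """Componentwise sum of two vectors of equal length."""
    return [a + b for a, b in zip(u, v)]

def scale(c, v):
    """Scalar multiple c * v."""
    return [c * a for a in v]

A = [[1, 2, 0],
     [0, -1, 3]]   # represents a map T from R^3 to R^2
v1 = [1, 0, 2]
v2 = [-3, 4, 1]
c = 5

# T(v1 + v2) == T(v1) + T(v2)
additive = apply(A, add(v1, v2)) == add(apply(A, v1), apply(A, v2))
# T(c v1) == c T(v1)
homogeneous = apply(A, scale(c, v1)) == scale(c, apply(A, v1))
print(additive, homogeneous)
```

A check on a few vectors does not prove linearity; it only illustrates what the two conditions say.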

As we will see, there are two major ways to define a 'derivative' of a multivariable function. We first present the seemingly more straightforward way of using "Partial Derivatives".

Directional and Partial Derivatives

Let \mathbf{f}:\mathbb{R}^n\to \mathbb{R}^m

Let \mathbf{a},\mathbf{y}\in \mathbb{R}^n

We say that \mathbf{f} is differentiable at \mathbf{a}\in\mathbb{R}^n with respect to vector \mathbf{y} if and only if there exists \mathbf{L}\in\mathbb{R}^m that satisfies

\lim_{h\to 0}\frac{\mathbf{f}(\mathbf{a}+h\mathbf{y})-\mathbf{f}(\mathbf{a})}{h}=\mathbf{L}

\mathbf{L} is said to be the derivative of \mathbf{f} at \mathbf{a} with respect to \mathbf{y} and is written as \mathbf{f}'(\mathbf{a};\mathbf{y})
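As a numerical illustration of this limit, the sketch below evaluates the difference quotient for shrinking h. The sample function f(x_1,x_2)=(x_1^2x_2,\,x_1+x_2^2), the point \mathbf{a}=(1,2) and the vector \mathbf{y}=(3,-1) are assumptions made for the example, as is the helper name difference_quotient; a hand computation gives \mathbf{f}'(\mathbf{a};\mathbf{y})=(11,-1) for this choice.

```python
# Sketch: approximate f'(a; y) by the quotient (f(a + h*y) - f(a)) / h.
# The function f, the point a and the vector y are illustrative choices.

def difference_quotient(f, a, y, h):
    """Return (f(a + h*y) - f(a)) / h, computed componentwise."""
    shifted = [ai + h * yi for ai, yi in zip(a, y)]
    return [(fs - fa) / h for fs, fa in zip(f(shifted), f(a))]

def f(p):
    """Sample map from R^2 to R^2."""
    x1, x2 = p
    return [x1 * x1 * x2, x1 + x2 * x2]

a = [1.0, 2.0]
y = [3.0, -1.0]

# The quotients should approach (11, -1) as h shrinks.
for h in (1e-1, 1e-3, 1e-5):
    print(h, difference_quotient(f, a, y, h))
```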

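When \mathbf{y} is a unit vector, the derivative is said to be a directional derivative; when \mathbf{y} is one of the standard basis vectors \mathbf{e}_i, it is the partial derivative with respect to the i-th coordinate. Here we will explicitly define partial derivatives and see some of their properties.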

Let f be a real multivariate function defined on an open subset \Omega of  \mathbb{R}^{n}

 f: \Omega \longrightarrow \mathbb{R} .

Then the partial derivative at a point  (x_1,...,x_n) with respect to the coordinate x_i is defined as the following limit, when it exists

 \lim_{h \rightarrow 0} {f(x_1,\ldots,x_i+h,\ldots,x_n)-f(x_1,\ldots,x_i,\ldots,x_n) \over h} = { \partial f \over \partial x_i } .
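The limit above suggests a direct numerical approximation: change only the i-th coordinate by a small h and form the quotient. This is a minimal sketch under assumed choices; the helper name partial, the step size and the sample function f(x_1,x_2,x_3)=x_1\sin x_2+x_3^2 are not from the text. At the point (2,0,1) the exact partial derivatives of this sample are 0, 2 and 2.

```python
import math

# Sketch: forward-difference approximation of a partial derivative.
# The function f and the evaluation point are illustrative choices.

def partial(f, x, i, h=1e-6):
    """Approximate the partial derivative of f at x in coordinate i (0-based)."""
    bumped = list(x)
    bumped[i] += h          # only the i-th coordinate changes
    return (f(bumped) - f(x)) / h

def f(x):
    """Sample real-valued function on R^3."""
    return x[0] * math.sin(x[1]) + x[2] ** 2

point = [2.0, 0.0, 1.0]
gradient = [partial(f, point, i) for i in range(3)]
print(gradient)
```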

f is said to be differentiable at the point  \mathbf{x}=(x_1,...,x_n) if, for increments \mathbf{h}=(h_1,...,h_n)\in\mathbb{R}^n, the difference  f(\mathbf{x}+\mathbf{h})-f(\mathbf{x}) is equivalent up to first order in \mathbf{h} to a linear form L (of \mathbf{h}), that is

 f(\mathbf{x}+\mathbf{h})-f(\mathbf{x}) = L(\mathbf{h}) + o(\|\mathbf{h}\|).

The linear form L is then said to be the differential of f at (x_1,...,x_n) , and is written as  Df|_{(x_1,\ldots,x_n)} or sometimes  \mathrm{d}f(x_1,\ldots,x_n) .

In this case, where f is differentiable at (x_1,\ldots,x_n) , by linearity we can write

 \mathrm{d}f={\partial f \over \partial x_1}\mathrm{d} x_1+\ldots+{\partial f \over \partial x_n}\mathrm{d} x_n

f is said to be continuously differentiable if its differential is defined at every point of its domain and varies continuously with the point  (x_1,...,x_n) , that is, if its coordinates (as a linear form)  {\partial f \over \partial x_1},\ldots,{\partial f \over \partial x_n} vary continuously.

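Partial derivatives may exist at a point where f is not differentiable, and sometimes not even continuous. For example,

 f:(x,y)\mapsto {xy\over x^2+y^2}

(with  f(0,0)=0 ) has both partial derivatives at the origin, since f vanishes along both axes, but it is not continuous there, since f(t,t)=1/2 for every t\neq 0. In such a case we say that f is separably differentiable.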
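A standard function with this behaviour is (x,y)\mapsto xy/(x^2+y^2), set to 0 at the origin. The sketch below, which assumes this particular function and nothing else, shows that both difference quotients at the origin are zero while the values along the diagonal stay at 1/2, so the function cannot be continuous at (0,0).

```python
# Sketch: partial derivatives exist at the origin, yet the function is
# discontinuous there. The function is the standard example
# f(x, y) = x*y / (x^2 + y^2) with f(0, 0) = 0.

def f(x, y):
    if x == 0 and y == 0:
        return 0.0
    return x * y / (x * x + y * y)

steps = (1e-1, 1e-3, 1e-6)

# Difference quotients along the coordinate axes at the origin.
quotients_x = [(f(h, 0.0) - f(0.0, 0.0)) / h for h in steps]
quotients_y = [(f(0.0, h) - f(0.0, 0.0)) / h for h in steps]

# Values along the diagonal x = y, approaching the origin.
diagonal = [f(h, h) for h in steps]

print(quotients_x, quotients_y, diagonal)
```

The quotients are zero because f vanishes on both axes, whereas the diagonal values do not tend to f(0,0)=0.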

Total Derivatives

The total derivative is important because it preserves some of the key properties of the single-variable derivative, most notably the assertion that differentiability implies continuity.

Let f:A\subseteq\mathbb{R}^n\to \mathbb{R}^m

We say that f is differentiable at \mathbf{a}\in A if and only if there exists a linear transformation, \mathbf{D} f(\mathbf{a}):\mathbb{R}^n\to \mathbb{R}^m, called the derivative or total derivative of f at \mathbf{a}, such that

\lim_{\| \mathbf{h} \| \to 0}\frac{\| f(\mathbf{a}+\mathbf{h})-f(\mathbf{a})-\mathbf{D} f(\mathbf{a})(\mathbf{h}) \|}{\| \mathbf{h} \|}=0

One should read \mathbf{D} f(\mathbf{a})(\mathbf{h}) as the linear transformation \mathbf{D} f(\mathbf{a}) applied to the vector \mathbf{h}. Sometimes it is customary to write this as \mathbf{D} f(\mathbf{a}) \cdot (\mathbf{h}).
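The defining limit can be observed numerically. The sketch below assumes the sample map f(x_1,x_2)=(x_1^2x_2,\,x_1+x_2^2) and the point \mathbf{a}=(1,2), for which a hand computation gives the candidate derivative \mathbf{h}\mapsto(4h_1+h_2,\,h_1+4h_2). The names Df_a and remainder_ratio are hypothetical helpers introduced for the illustration.

```python
import math

# Sketch: the ratio |f(a+h) - f(a) - Df(a)(h)| / |h| should tend to 0.
# The map f, the point a and the linear map Df_a are illustrative choices.

def norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(c * c for c in v))

def f(p):
    """Sample map from R^2 to R^2."""
    x1, x2 = p
    return [x1 * x1 * x2, x1 + x2 * x2]

def Df_a(h):
    """Candidate derivative of f at a = (1, 2), computed by hand."""
    return [4 * h[0] + h[1], h[0] + 4 * h[1]]

def remainder_ratio(h):
    """Return |f(a+h) - f(a) - Df(a)(h)| / |h| at a = (1, 2)."""
    a = [1.0, 2.0]
    fa = f(a)
    fah = f([a[0] + h[0], a[1] + h[1]])
    lin = Df_a(h)
    remainder = [fah[i] - fa[i] - lin[i] for i in range(2)]
    return norm(remainder) / norm(h)

# Shrink h along the fixed direction (0.6, -0.8).
ratios = [remainder_ratio([0.6 * t, -0.8 * t]) for t in (1e-1, 1e-2, 1e-3)]
print(ratios)
```

Shrinking ratios along one direction are only consistent with the definition; the limit itself must hold for every way in which \mathbf{h} tends to \mathbf{0}.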


Suppose A \subseteq \mathbb{R}^n is an open set and f:A\to \mathbb{R}^m is differentiable on A. Think of writing f in components so f(x_1,\ldots,x_n)=(f_1(x_1,\ldots,x_n),\ldots,f_m(x_1,\ldots,x_n)). Then the partial derivatives \frac{\partial f_j}{\partial x_i} exist, and the matrix representing the linear transformation \mathbf{D} f(\mathbf{x}) with respect to the standard bases of \mathbb{R}^n and \mathbb{R}^m is given by the Jacobian Matrix:

\begin{bmatrix} \frac{\partial f_1}{\partial x_1} & \cdots & \frac{\partial f_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial f_m}{\partial x_1} & \cdots & \frac{\partial f_m}{\partial x_n}  \end{bmatrix}.

evaluated at \mathbf{x}=(x_1,\ldots,x_n).
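The Jacobian matrix can be approximated entry by entry with difference quotients, placing \frac{\partial f_j}{\partial x_i} in row j and column i. The following is a sketch under assumed choices: the helper name jacobian, the step size and the sample map f(x_1,x_2)=(x_1x_2,\,x_1+x_2,\,x_1^2) are illustrative. At (1,2) the exact Jacobian of this sample has rows (2,1), (1,1) and (2,0).

```python
# Sketch: forward-difference approximation of the Jacobian matrix.
# The map f and the evaluation point are illustrative choices.

def jacobian(f, x, h=1e-6):
    """Approximate the m-by-n Jacobian of f at x; entry [j][i] is df_j/dx_i."""
    fx = f(x)
    m, n = len(fx), len(x)
    J = [[0.0] * n for _ in range(m)]
    for i in range(n):
        bumped = list(x)
        bumped[i] += h              # perturb only the i-th coordinate
        fb = f(bumped)
        for j in range(m):
            J[j][i] = (fb[j] - fx[j]) / h
    return J

def f(p):
    """Sample map from R^2 to R^3."""
    x1, x2 = p
    return [x1 * x2, x1 + x2, x1 * x1]

J = jacobian(f, [1.0, 2.0])
for row in J:
    print(row)
```

The result has m=3 rows and n=2 columns, matching a linear transformation from \mathbb{R}^2 to \mathbb{R}^3.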

NOTE: This theorem requires the function to be differentiable to begin with. It is a common mistake to assume that if the partial derivatives exist, then the function must be differentiable because the Jacobian matrix can be constructed. This is false: the existence of the partial derivatives alone does not imply differentiability. A sufficient condition is given by the next theorem:


Suppose A \subseteq \mathbb{R}^n is an open set and f:A\to \mathbb{R}^m. Think of writing f in components so f(x_1,\ldots,x_n)=(f_1(x_1,\ldots,x_n),\ldots,f_m(x_1,\ldots,x_n)). If \frac{\partial f_j}{\partial x_i} exists and is continuous on A for all j\in \{1,\ldots, m \} and for all i\in \{1,\ldots,n \}, then f is differentiable on A.

This theorem gives us a convenient criterion for a function to be differentiable.