Calculus/Inverse function theorem, implicit function theorem

From Wikibooks, open books for an open world
Inverse function theorem, implicit function theorem

In this chapter, we want to prove the inverse function theorem (which asserts that if a function has invertible differential at a point, then it is locally invertible itself) and the implicit function theorem (which asserts that certain sets are the graphs of functions).

Banach's fixed point theorem

Theorem:

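Let $(M, d)$ be a complete metric space, and let $f: M \to M$ be a strict contraction; that is, there exists a constant $0 \le \lambda < 1$ such that

$$\forall m, n \in M: \quad d(f(m), f(n)) \le \lambda \, d(m, n).$$

Then $f$ has a unique fixed point, which means that there is a unique $x \in M$ such that $f(x) = x$. Furthermore, if we start with a completely arbitrary point $y \in M$, then the sequence

$$y, \; f(y), \; f(f(y)), \; f(f(f(y))), \; \ldots$$

converges to $x$.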

Proof:

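First, we prove uniqueness of the fixed point. Assume $x, y \in M$ are both fixed points. Then

$$d(x, y) = d(f(x), f(y)) \le \lambda \, d(x, y), \quad \text{that is,} \quad (1 - \lambda) \, d(x, y) \le 0.$$

Since $0 \le \lambda < 1$, this implies $d(x, y) = 0$ and hence $x = y$.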

Now we prove existence and simultaneously the claim about the convergence of the sequence $y, f(y), f(f(y)), \ldots$. To fix notation, we set $z_0 := y$ and, if $z_n$ is already defined, $z_{n+1} := f(z_n)$. Then the sequence $(z_n)_{n \ge 0}$ is nothing else but the sequence $y, f(y), f(f(y)), \ldots$.

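Let $n \ge 0$. We claim that

$$d(z_{n+1}, z_n) \le \lambda^n \, d(z_1, z_0).$$

Indeed, this follows by induction on $n$. The case $n = 0$ is trivial, and if the claim is true for $n$, then $d(z_{n+2}, z_{n+1}) = d(f(z_{n+1}), f(z_n)) \le \lambda \, d(z_{n+1}, z_n) \le \lambda^{n+1} \, d(z_1, z_0)$.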

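Hence, by the triangle inequality, for all $m \ge 1$,

$$d(z_{n+m}, z_n) \le \sum_{j=n+1}^{n+m} d(z_j, z_{j-1}) \le \sum_{j=n+1}^{n+m} \lambda^{j-1} \, d(z_1, z_0) \le \lambda^n \, d(z_1, z_0) \sum_{j=0}^{\infty} \lambda^j = \frac{\lambda^n}{1 - \lambda} \, d(z_1, z_0).$$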

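The latter expression goes to zero as $n \to \infty$, and hence we are dealing with a Cauchy sequence. As we are in a complete metric space, it converges to a limit $x \in M$. This limit is moreover a fixed point, as the continuity of $f$ ($f$ is Lipschitz continuous with constant $\lambda$) implies

$$x = \lim_{n \to \infty} z_n = \lim_{n \to \infty} f(z_{n-1}) = f\left(\lim_{n \to \infty} z_{n-1}\right) = f(x).$$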
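The iteration from the proof can be tried out numerically. The following is a minimal sketch, not part of the original text: it assumes the example $f = \cos$ on $M = [0, 1]$ (a strict contraction there with $\lambda = \sin 1$), and the helper name `fixed_point_iteration` is hypothetical. It also compares the error with the bound $\frac{\lambda^n}{1 - \lambda} d(z_1, z_0)$, which follows from the estimate above by letting $m \to \infty$.

```python
import math

def fixed_point_iteration(f, y, steps):
    """Return [z_0, ..., z_steps] with z_0 = y and z_{n+1} = f(z_n)."""
    zs = [y]
    for _ in range(steps):
        zs.append(f(zs[-1]))
    return zs

# Assumed example: f = cos on M = [0, 1].
# cos maps [0, 1] into [cos(1), 1], and |cos'(t)| = |sin(t)| <= sin(1) < 1 on [0, 1].
lam = math.sin(1.0)
zs = fixed_point_iteration(math.cos, 0.0, 100)
x = zs[-1]  # numerical approximation of the fixed point

# Compare the actual error with the bound lam^n / (1 - lam) * d(z_1, z_0).
d10 = abs(zs[1] - zs[0])
for n in (5, 10, 20):
    bound = lam ** n / (1.0 - lam) * d10
    print(n, abs(zs[n] - x), bound)
```

The printed errors should stay below the corresponding bounds; the bound is usually pessimistic, since $\lambda$ is a worst-case constant over all of $M$.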

A corollary to this important result is the following lemma, which shall be the main ingredient for the proof of the inverse function theorem:

Lemma:

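Let $g: \overline{B_r(0)} \to \mathbb{R}^n$ ($\overline{B_r(0)} \subseteq \mathbb{R}^n$ denoting the closed ball of radius $r > 0$ around the origin) be a function which is Lipschitz continuous with Lipschitz constant less than or equal to $\frac{1}{2}$ such that $g(0) = 0$. Then the function

$$f: \overline{B_r(0)} \to \mathbb{R}^n, \quad f(x) := x + g(x)$$

is injective and $\overline{B_{r/2}(0)} \subseteq f\left(\overline{B_r(0)}\right)$.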

Proof:

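First, we note that for $y \in \overline{B_{r/2}(0)}$ the function

$$h: \overline{B_r(0)} \to \mathbb{R}^n, \quad h(x) := y - g(x)$$

is a strict contraction; this is due to

$$\|h(x) - h(z)\| = \|g(z) - g(x)\| \le \frac{1}{2} \|x - z\|.$$

Furthermore, it maps $\overline{B_r(0)}$ to itself, since for $x \in \overline{B_r(0)}$

$$\|h(x)\| = \|y - g(x)\| \le \|y\| + \|g(x) - g(0)\| \le \frac{r}{2} + \frac{1}{2} \|x\| \le r.$$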

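Hence, the Banach fixed-point theorem is applicable to $h$ (the closed ball $\overline{B_r(0)}$ is a complete metric space). Now $x$ being a fixed point of $h$ is equivalent to

$$y = x + g(x) = f(x),$$

and thus $\overline{B_{r/2}(0)} \subseteq f\left(\overline{B_r(0)}\right)$ follows from the existence of fixed points. Furthermore, if $f(x) = f(z)$, then

$$\|x - z\| = \|g(z) - g(x)\| \le \frac{1}{2} \|x - z\|$$

and hence $x = z$. This proves injectivity.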
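The proof is constructive: a preimage of $y$ under $f$ is found by iterating $h(x) = y - g(x)$. Here is a minimal sketch in $\mathbb{R}^2$, assuming the illustrative choice $g(x_1, x_2) = \left(\tfrac14 \sin x_2, \tfrac14 \sin x_1\right)$, which satisfies $g(0) = 0$ and has Lipschitz constant at most $\tfrac14 \le \tfrac12$. The function names and the radius $r = 1$ are assumptions made for this example only.

```python
import math

def g(x):
    # Assumed example: g(0) = 0, Lipschitz constant <= 1/4 <= 1/2.
    return (0.25 * math.sin(x[1]), 0.25 * math.sin(x[0]))

def f(x):
    # f(x) = x + g(x), as in the lemma.
    gx = g(x)
    return (x[0] + gx[0], x[1] + gx[1])

def solve(y, steps=60):
    """Approximate the fixed point of h(x) = y - g(x), starting from x = 0."""
    x = (0.0, 0.0)
    for _ in range(steps):
        gx = g(x)
        x = (y[0] - gx[0], y[1] - gx[1])
    return x

r = 1.0
y = (0.3, -0.4)   # a point of norm r/2
x = solve(y)
print(x, f(x))    # f(x) should reproduce y up to rounding
```

As the lemma predicts, the computed $x$ lies in the closed ball of radius $r$ and satisfies $f(x) \approx y$.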

The inverse function theorem

Theorem:

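Let $f: \mathbb{R}^n \to \mathbb{R}^n$ be a function which is continuously differentiable in a neighbourhood of $x_0 \in \mathbb{R}^n$ such that $f'(x_0)$ is invertible. Then there exists an open set $U \subseteq \mathbb{R}^n$ with $x_0 \in U$ such that $f|_U: U \to f(U)$ is a bijective function with an inverse $f^{-1}: f(U) \to U$ which is differentiable at $f(x_0)$ and satisfies

$$(f^{-1})'(f(x_0)) = \left(f'(x_0)\right)^{-1}.$$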

Proof:

We first reduce to the case $x_0 = 0$, $f(x_0) = 0$ and $f'(x_0) = \mathrm{Id}$. Indeed, suppose the theorem holds for all such functions, and let now $h$ be an arbitrary function satisfying the requirements of the theorem (where the differentiability is given at $x_0$). We set

$$\tilde h(x) := h'(x_0)^{-1} \left( h(x + x_0) - h(x_0) \right)$$

and obtain that $\tilde h$ is differentiable at $0$ with differential $\mathrm{Id}$ and $\tilde h(0) = 0$; the first property follows since we multiply both the function and the linear-affine approximation by $h'(x_0)^{-1}$ and only shift the function, and the second one is seen from inserting $x = 0$. Hence, we obtain an inverse $\tilde h^{-1}$ of $\tilde h$ together with its differential at $0$, and if we now set

$$h^{-1}(y) := \tilde h^{-1}\left( h'(x_0)^{-1} (y - h(x_0)) \right) + x_0,$$

it can be seen that $h^{-1}$ is an inverse of $h$ with all the required properties (which is a bit of a tedious exercise, but involves nothing more than the definitions).

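Thus let $f$ be a function such that $f(0) = 0$, $f$ is continuously differentiable in a neighbourhood of $0$ and $f'(0) = \mathrm{Id}$. We define

$$g(x) := f(x) - x.$$

The differential of this function at $0$ is zero (since taking the differential is linear and the differential of the function $x \mapsto x$ is the identity). Since $g$ is also continuously differentiable in a small neighbourhood of $0$, we find $\delta > 0$ such that

$$\left| \frac{\partial g_k}{\partial x_j}(x) \right| < \frac{1}{2 n^2}$$

for all $j, k \in \{1, \ldots, n\}$ and $x \in B_\delta(0)$. Since further $g(0) = 0$, the general mean-value theorem and Cauchy's inequality imply that for $k \in \{1, \ldots, n\}$ and $x \in B_\delta(0)$,

$$|g_k(x)| = \left| \langle \nabla g_k(t_k x), x \rangle \right| \le \|\nabla g_k(t_k x)\| \, \|x\| \le \frac{1}{2n} \|x\|$$

for suitable $t_k \in [0, 1]$. Hence,

$$\|g(x)\| \le |g_1(x)| + \cdots + |g_n(x)| \le \frac{1}{2} \|x\|$$ (triangle inequality).

The same argument, applied to $g_k(x) - g_k(z)$ along the line segment from $z$ to $x$ (which stays inside the convex set $B_\delta(0)$), yields $\|g(x) - g(z)\| \le \frac{1}{2} \|x - z\|$ for all $x, z \in B_\delta(0)$. Thus, our preparatory lemma is applicable on $\overline{B_r(0)}$ for any $0 < r < \delta$: $f$ is injective on $\overline{B_r(0)}$, and its image contains the open set $B_{r/2}(0)$. We may therefore pick $U := B_r(0) \cap f^{-1}(B_{r/2}(0))$, which is open due to the continuity of $f$ and on which $f$ is a bijection onto $B_{r/2}(0)$.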

Thus, the most important part of the theorem is already done. All that is left to do is to prove differentiability of $f^{-1}$ at $0$. In fact, we prove the slightly stronger claim that the differential of $f^{-1}$ at $0$ is given by the identity, although this would also follow from the chain rule once differentiability is proven.

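Note now that the contraction estimate $\|g(x)\| \le \frac{1}{2} \|x\|$ for $g$ implies the following bounds on $f$:

$$\frac{1}{2} \|x\| \le \|f(x)\| \le \frac{3}{2} \|x\|.$$

The second bound follows from

$$\|f(x)\| = \|x + g(x)\| \le \|x\| + \|g(x)\| \le \frac{3}{2} \|x\|,$$

and the first bound follows from

$$\|f(x)\| = \|x + g(x)\| \ge \|x\| - \|g(x)\| \ge \frac{1}{2} \|x\|.$$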

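Now for the differentiability of $f^{-1}$ at $0$. We have, by the substitution $y = f(x)$ in the limit (which is permitted since $\|f^{-1}(y)\| \le 2 \|y\|$, so that $f^{-1}(y) \to 0$ as $y \to 0$, and $f^{-1}(0) = 0$):

$$\lim_{y \to 0} \frac{\|f^{-1}(y) - f^{-1}(0) - y\|}{\|y\|} = \lim_{x \to 0} \frac{\|x - f(x)\|}{\|f(x)\|},$$

where the last expression converges to zero due to the differentiability of $f$ at $0$ with differential the identity, and the sandwich criterion applied to the expressions

$$\frac{\|x - f(x)\|}{\frac{3}{2} \|x\|}$$

and

$$\frac{\|x - f(x)\|}{\frac{1}{2} \|x\|}.$$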
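The last step of the proof can be checked numerically in the reduced case $f(0) = 0$, $f'(0) = \mathrm{Id}$. The sketch below assumes the illustrative choice $g(x_1, x_2) = (x_2^2, x_1 x_2)$, so that $f(x) = x + g(x)$, and computes $f^{-1}$ by the fixed-point iteration from the preparatory lemma. The function names are hypothetical. If the differential of $f^{-1}$ at $0$ is the identity, the quotient $\|f^{-1}(y) - y\| / \|y\|$ should shrink as $y \to 0$.

```python
import math

def g(x):
    # Assumed example with g(0) = 0 and g'(0) = 0.
    return (x[1] ** 2, x[0] * x[1])

def f(x):
    gx = g(x)
    return (x[0] + gx[0], x[1] + gx[1])

def f_inverse(y, steps=200):
    """Approximate f^{-1}(y) near 0 by iterating x -> y - g(x)."""
    x = (0.0, 0.0)
    for _ in range(steps):
        gx = g(x)
        x = (y[0] - gx[0], y[1] - gx[1])
    return x

def ratio(y):
    # The quotient ||f^{-1}(y) - y|| / ||y|| from the differentiability proof.
    x = f_inverse(y)
    return math.hypot(x[0] - y[0], x[1] - y[1]) / math.hypot(y[0], y[1])

ratios = [ratio((t, t)) for t in (5e-2, 5e-3, 5e-4)]
print(ratios)  # should decrease towards zero
```

This only illustrates the statement for one function along one direction; it is not a substitute for the proof.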

The implicit function theorem

Theorem:

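Let $f: \mathbb{R}^n \to \mathbb{R}$ be a continuously differentiable function, and consider the set

$$S := \{ (x_1, \ldots, x_n) \in \mathbb{R}^n \mid f(x_1, \ldots, x_n) = 0 \}.$$

If we are given some $y = (y_1, \ldots, y_n) \in S$ such that $\partial_n f(y) \neq 0$, then we find $U \subseteq \mathbb{R}^{n-1}$ open with $(y_1, \ldots, y_{n-1}) \in U$ and $g: U \to \mathbb{R}$ such that

$$\{ (z_1, \ldots, z_{n-1}, g(z_1, \ldots, z_{n-1})) \mid (z_1, \ldots, z_{n-1}) \in U \} \subseteq S$$ and $$g(y_1, \ldots, y_{n-1}) = y_n,$$

where $\{ (z_1, \ldots, z_{n-1}, g(z_1, \ldots, z_{n-1})) \mid (z_1, \ldots, z_{n-1}) \in U \}$ is open with respect to the subspace topology of $S$.

Furthermore, $g$ is a differentiable function.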

Proof:

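We define a new function

$$F: \mathbb{R}^n \to \mathbb{R}^n, \quad F(x_1, \ldots, x_n) := (x_1, \ldots, x_{n-1}, f(x_1, \ldots, x_n)).$$

The differential of this function looks like this:

$$F'(x) = \begin{pmatrix} 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & 0 \\ \partial_1 f(x) & \partial_2 f(x) & \cdots & \partial_{n-1} f(x) & \partial_n f(x) \end{pmatrix}$$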

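Since we assumed that $\partial_n f(y) \neq 0$, the differential $F'(y)$ is invertible (its matrix is lower triangular with determinant $\partial_n f(y)$), and hence the inverse function theorem implies the existence of a small open neighbourhood $\tilde U \subseteq \mathbb{R}^n$ containing $y$ such that $F$ restricted to that neighbourhood is itself invertible, with a differentiable inverse $F^{-1}$, which is itself defined on an open set $\tilde V := F(\tilde U)$ containing $F(y)$. Now set first

$$U := \{ (x_1, \ldots, x_{n-1}) \in \mathbb{R}^{n-1} \mid (x_1, \ldots, x_{n-1}, 0) \in \tilde V \},$$

which is open in $\mathbb{R}^{n-1}$ (it is the preimage of the open set $\tilde V$ under the continuous map $(x_1, \ldots, x_{n-1}) \mapsto (x_1, \ldots, x_{n-1}, 0)$), and then

$$g: U \to \mathbb{R}, \quad g(z_1, \ldots, z_{n-1}) := \pi_n\left( F^{-1}(z_1, \ldots, z_{n-1}, 0) \right),$$

the $n$-th component of $F^{-1}(z_1, \ldots, z_{n-1}, 0)$. We claim that $g$ has the desired properties.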

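Indeed, we first note that $F^{-1}(z_1, \ldots, z_{n-1}, 0) = (z_1, \ldots, z_{n-1}, g(z_1, \ldots, z_{n-1}))$, since applying $F$ leaves the first $n - 1$ components unchanged, and thus we get the identity by observing $F(F^{-1}(x)) = x$. In particular, since $F(y) = (y_1, \ldots, y_{n-1}, 0)$, we obtain $g(y_1, \ldots, y_{n-1}) = y_n$. Let thus $(z_1, \ldots, z_{n-1}) \in U$. Then

$$f(z_1, \ldots, z_{n-1}, g(z_1, \ldots, z_{n-1})) = \pi_n\left( F(F^{-1}(z_1, \ldots, z_{n-1}, 0)) \right) = \pi_n(z_1, \ldots, z_{n-1}, 0) = 0.$$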

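Furthermore, the set

$$\{ (z_1, \ldots, z_{n-1}, g(z_1, \ldots, z_{n-1})) \mid (z_1, \ldots, z_{n-1}) \in U \}$$

is open with respect to the subspace topology on $S$. Indeed, we show

$$\{ (z_1, \ldots, z_{n-1}, g(z_1, \ldots, z_{n-1})) \mid (z_1, \ldots, z_{n-1}) \in U \} = S \cap \tilde U.$$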

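For $\subseteq$, we first note that the set on the left hand side is contained in $S$, since all points in it are mapped to zero by $f$. Further,

$$(z_1, \ldots, z_{n-1}, g(z_1, \ldots, z_{n-1})) = F^{-1}(z_1, \ldots, z_{n-1}, 0) \in F^{-1}(\tilde V) = \tilde U,$$

which completes the proof of $\subseteq$. For the other direction, let a point $(x_1, \ldots, x_n)$ in $S \cap \tilde U$ be given, and apply $F$ to get

$$F(x_1, \ldots, x_n) = (x_1, \ldots, x_{n-1}, 0) \in \tilde V$$

and hence $(x_1, \ldots, x_{n-1}) \in U$; further

$$(x_1, \ldots, x_n) = F^{-1}(x_1, \ldots, x_{n-1}, 0) = (x_1, \ldots, x_{n-1}, g(x_1, \ldots, x_{n-1}))$$

by applying $F^{-1}$ to both sides of the equation.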

Now $g$ is automatically differentiable as a component of the differentiable function $(z_1, \ldots, z_{n-1}) \mapsto F^{-1}(z_1, \ldots, z_{n-1}, 0)$.


Informally, the above theorem states that given a set $\{ x \in \mathbb{R}^n \mid f(x) = 0 \}$, one can choose the first $n - 1$ coordinates as a "base" for a function, whose graph is precisely a local bit of that set.
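As a concrete illustration, consider the unit circle, assumed here as an example: $f(x_1, x_2) = x_1^2 + x_2^2 - 1$ and $y = (0, 1)$, where $\partial_2 f(y) = 2 \neq 0$. The sketch below computes $g$ numerically by solving $f(z, t) = 0$ for $t$ with Newton's method. This is a numerical stand-in chosen for the example, not the construction via $F^{-1}$ used in the proof. The result is then compared with the explicit local solution $\sqrt{1 - z^2}$.

```python
import math

def f(x1, x2):
    # Assumed example: the unit circle is the zero set S of f.
    return x1 ** 2 + x2 ** 2 - 1.0

def df_dx2(x1, x2):
    # Partial derivative of f with respect to the last coordinate.
    return 2.0 * x2

def g(z, start=1.0, steps=50):
    """Solve f(z, t) = 0 for t by Newton's method, starting at y_2 = 1."""
    t = start
    for _ in range(steps):
        t -= f(z, t) / df_dx2(z, t)
    return t

zs = [-0.5, -0.1, 0.0, 0.1, 0.5]
values = [g(z) for z in zs]
for z, v in zip(zs, values):
    print(z, v, math.sqrt(1.0 - z ** 2))
```

Near $(0, 1)$ the circle is thus the graph of $g$ over the first coordinate. At the points $(\pm 1, 0)$ the hypothesis $\partial_2 f \neq 0$ fails, and the circle is indeed not locally a graph over $x_1$ there.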