
In calculus, the mean value theorem states, roughly, that given a section of a smooth curve, there is at least one point on that section at which the derivative (slope) of the curve is equal (parallel) to the "average" derivative of the section.[1] It is used to prove theorems that make global conclusions about a function on an interval starting from local hypotheses about derivatives at points of the interval.

This theorem can be understood concretely by applying it to motion: If a car travels one hundred miles in one hour, then its average speed during that time was 100 miles per hour. To get at that average speed, the car either has to go at a constant 100 miles per hour during that whole time, or, if it goes slower at one moment, it has to go faster at another moment as well (and vice versa), in order to still end up with an average of 100 miles per hour. The Mean Value Theorem tells us that at some point during the journey, the car was traveling at exactly 100 miles per hour; that is, it was traveling at its average speed.

An early version of this theorem was first described by Parameshvara (1370–1460) from the Kerala school of astronomy and mathematics in his commentaries on Govindasvāmi and Bhaskara II.[2] The mean value theorem in its modern form was later stated by Augustin-Louis Cauchy (1789–1857). It is one of the most important results in differential calculus, as well as one of the most important theorems in mathematical analysis, and is essential in proving the fundamental theorem of calculus. The mean value theorem follows from the more specific statement of Rolle's theorem, and can be used to prove the more general statement of Taylor's theorem (with Lagrange form of the remainder term).

## Formal statement

Let ${\displaystyle f:[a,b]\to \mathbb {R} }$ be a continuous function on the closed interval ${\displaystyle [a,b]}$ , and differentiable on the open interval ${\displaystyle (a,b)}$ , where ${\displaystyle a<b}$ . Then there exists some ${\displaystyle c\in (a,b)}$ such that

${\displaystyle f'(c)={\frac {f(b)-f(a)}{b-a}}}$

The mean value theorem is a generalization of Rolle's theorem, which assumes ${\displaystyle f(a)=f(b)}$ , so that the right-hand side above is 0.
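The statement can be illustrated numerically. The sketch below (an illustration, not part of the theorem) bisects for a point c at which f'(c) equals the secant slope, using the sample function f(x) = x³ on [0, 2], for which c = 2/√3.

```python
def mvt_point(f, df, a, b, tol=1e-12):
    # Secant slope of f over [a, b]
    slope = (f(b) - f(a)) / (b - a)
    lo, hi = a, b
    # Bisection on df(x) - slope; assumes a sign change on [a, b],
    # which holds for the increasing derivative below but not for every f.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (df(lo) - slope) * (df(mid) - slope) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

# Sample: f(x) = x^3 on [0, 2]; the secant slope is (8 - 0)/2 = 4,
# so 3c^2 = 4 and c = 2/sqrt(3)
f = lambda x: x**3
df = lambda x: 3 * x**2
c = mvt_point(f, df, 0.0, 2.0)
```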

The mean value theorem is still valid in a slightly more general setting. One only needs to assume that ${\displaystyle f:[a,b]\to \mathbb {R} }$ is continuous function on ${\displaystyle [a,b]}$ , and that for every ${\displaystyle x\in (a,b)}$ the limit

${\displaystyle \lim _{h\to 0}{\frac {f(x+h)-f(x)}{h}}}$

exists as a finite number or equals ${\displaystyle \infty }$ or ${\displaystyle -\infty }$ . If finite, that limit equals ${\displaystyle f'(x)}$ . An example where this version of the theorem applies is given by the real-valued cube root function mapping ${\displaystyle x}$ to ${\displaystyle {\sqrt[{3}]{x}}}$ , whose derivative tends to infinity at the origin.

Note that the theorem is false if a differentiable function is complex-valued instead of real-valued. Indeed, define ${\displaystyle f(x)=e^{xi}}$ for all real ${\displaystyle x}$ . Then

${\displaystyle f(2\pi )-f(0)=0=0(2\pi -0)}$

while

${\displaystyle {\big |}f'(x){\big |}=1}$

for all real ${\displaystyle x}$ , so ${\displaystyle f'(c)}$ is never zero and no ${\displaystyle c}$ can satisfy ${\displaystyle f'(c)={\frac {f(2\pi )-f(0)}{2\pi -0}}=0}$ .
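A quick numerical check of this counterexample (a sketch using Python's cmath):

```python
import cmath

# f(x) = e^{ix}: the endpoint values agree, yet |f'(x)| = 1 everywhere,
# so no c in (0, 2*pi) can have f'(c) = (f(2*pi) - f(0)) / (2*pi) = 0.
f = lambda x: cmath.exp(1j * x)
df = lambda x: 1j * cmath.exp(1j * x)

endpoint_diff = abs(f(2 * cmath.pi) - f(0))                 # ~ 0
speeds = [abs(df(2 * cmath.pi * k / 1000)) for k in range(1001)]
```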

## Proof

The expression ${\displaystyle {\frac {f(b)-f(a)}{b-a}}}$ gives the slope of the line joining the points ${\displaystyle (a,f(a))}$ and ${\displaystyle (b,f(b))}$ , which is a chord of the graph of ${\displaystyle f}$ , while ${\displaystyle f'(x)}$ gives the slope of the tangent to the curve at the point ${\displaystyle (x,f(x))}$ . Thus the mean value theorem says that given any chord of a smooth curve, we can find a point lying between the end-points of the chord such that the tangent at that point is parallel to the chord. The following proof illustrates this idea.

Define ${\displaystyle g(x)=f(a)+{\frac {f(b)-f(a)}{b-a}}(x-a)}$ , where the graph of ${\displaystyle g}$ is the chord passing through the points ${\displaystyle (a,f(a))}$ and ${\displaystyle (b,f(b))}$ . Its slope is of course ${\displaystyle g'(x)={\frac {f(b)-f(a)}{b-a}}}$ .

Now define ${\displaystyle h(x)={\color {blue}f(x)-g(x)}=f(x)-\left(f(a)+{\frac {f(b)-f(a)}{b-a}}(x-a)\right)}$ . Since ${\displaystyle f}$ is continuous on ${\displaystyle [a,b]}$ and differentiable on ${\displaystyle (a,b)}$ , the same is true of ${\displaystyle h}$ .

Plugging the values ${\displaystyle a,b}$ into ${\displaystyle h(x)}$ gives us that ${\displaystyle h(a)=h(b)=0}$ . This satisfies the conditions of Rolle's theorem, so there is a point ${\displaystyle c\in (a,b)}$ such that ${\displaystyle h'(c)=0}$ .

We know from the definitions that ${\displaystyle h'(c)={\color {blue}f'(c)-g'(c)}=0}$ . In addition we know that ${\displaystyle g'(c)={\frac {f(b)-f(a)}{b-a}}}$ for all ${\displaystyle c\in (a,b)}$ .

So we get

${\displaystyle f'(c)={\frac {f(b)-f(a)}{b-a}}}$

as required. ${\displaystyle \blacksquare }$
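The auxiliary function h from the proof can be written out concretely for a sample function (the choice f(x) = x³ on [0, 2] is illustrative, not part of the proof):

```python
f = lambda x: x**3
a, b = 0.0, 2.0
slope = (f(b) - f(a)) / (b - a)
g = lambda x: f(a) + slope * (x - a)   # chord through (a, f(a)) and (b, f(b))
h = lambda x: f(x) - g(x)              # auxiliary function from the proof

# h vanishes at both endpoints, so Rolle's theorem applies to h
endpoints = (h(a), h(b))
```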

## A simple application

Assume that ${\displaystyle f}$ is a continuous, real-valued function, defined on an arbitrary interval ${\displaystyle I}$ of the real line. If the derivative of ${\displaystyle f}$ at every interior point of the interval ${\displaystyle I}$ exists and is 0, then ${\displaystyle f}$ is constant.

Proof: Assume the derivative of ${\displaystyle f}$ at every interior point of the interval ${\displaystyle I}$ exists and is 0. Let ${\displaystyle (a,b)}$ be an arbitrary open interval in ${\displaystyle I}$ . By the mean value theorem, there exists a point ${\displaystyle c\in (a,b)}$ such that

${\displaystyle f'(c)={\frac {f(b)-f(a)}{b-a}}=0}$

This implies that ${\displaystyle f(a)=f(b)}$ . Thus, ${\displaystyle f}$ is constant on the interior of ${\displaystyle I}$ and thus is constant on ${\displaystyle I}$ by continuity. (See below for a multivariable version of this result.)

Remarks:

• Only continuity of ${\displaystyle f}$ , not differentiability, is needed at the endpoints of the interval ${\displaystyle I}$ . No hypothesis of continuity needs to be stated if ${\displaystyle I}$ is an open interval, since the existence of a derivative at a point implies the continuity at this point. (See the section continuity and differentiability of the article derivative.)
• The differentiability of ${\displaystyle f}$ can be relaxed to one-sided differentiability, as shown in the article on semi-differentiability.

## Cauchy's mean value theorem

Cauchy's mean value theorem, also known as the extended mean value theorem, is the more general form of the mean value theorem. It states: If functions ${\displaystyle f,g}$ are both continuous on the closed interval ${\displaystyle [a,b]}$ , and differentiable on the open interval ${\displaystyle (a,b)}$ , then there exists some ${\displaystyle c\in (a,b)}$ such that

${\displaystyle g'(c){\big (}f(b)-f(a){\big )}=f'(c){\big (}g(b)-g(a){\big )}}$

Of course, if ${\displaystyle g(a)\neq g(b)}$ and if ${\displaystyle g'(c)\neq 0}$ , this is equivalent to:

${\displaystyle {\frac {f'(c)}{g'(c)}}={\frac {f(b)-f(a)}{g(b)-g(a)}}}$

Geometrically, this means that there is some tangent to the graph of the curve

${\displaystyle {\begin{array}{ccc}[a,b]&\longrightarrow &\mathbb {R} ^{2}\\t&\mapsto &{\big (}f(t),g(t){\big )}\end{array}}}$

which is parallel to the line defined by the points ${\displaystyle {\big (}f(a),g(a){\big )}}$ and ${\displaystyle {\big (}f(b),g(b){\big )}}$ . However Cauchy's theorem does not claim the existence of such a tangent in all cases where ${\displaystyle {\big (}f(a),g(a){\big )}}$ and ${\displaystyle {\big (}f(b),g(b){\big )}}$ are distinct points, since it might be satisfied only for some value ${\displaystyle c}$ with ${\displaystyle f'(c)=g'(c)=0}$ , in other words a value for which the mentioned curve is stationary; in such points no tangent to the curve is likely to be defined at all. An example of this situation is the curve given by

${\displaystyle t\mapsto (t^{3},1-t^{2})}$

which on the interval ${\displaystyle [-1,1]}$ goes from the point ${\displaystyle (-1,0)}$ to ${\displaystyle (1,0)}$ , yet never has a horizontal tangent; however it has a stationary point (in fact a cusp) at ${\displaystyle t=0}$ .

Cauchy's mean value theorem can be used to prove l'Hôpital's rule. The mean value theorem is the special case of Cauchy's mean value theorem when ${\displaystyle g(t)=t}$ .
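Cauchy's theorem can likewise be checked numerically. The sketch below bisects for the point c, using the sample pair f(t) = t³ and g(t) = t² on [1, 2] (an illustrative choice, for which c = 14/9):

```python
def cauchy_mvt_point(f, df, g, dg, a, b, tol=1e-12):
    # Bisect for c with (g(b)-g(a)) f'(c) = (f(b)-f(a)) g'(c).
    # Assumes the difference F changes sign on [a, b] (true for this sample).
    F = lambda x: df(x) * (g(b) - g(a)) - dg(x) * (f(b) - f(a))
    lo, hi = a, b
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if F(lo) * F(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

# Sample pair: f(t) = t^3, g(t) = t^2 on [1, 2]; then f'(c)/g'(c) = 3c/2
# must equal (8 - 1)/(4 - 1) = 7/3, giving c = 14/9.
c = cauchy_mvt_point(lambda t: t**3, lambda t: 3 * t**2,
                     lambda t: t**2, lambda t: 2 * t, 1.0, 2.0)
```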

### Proof of Cauchy's mean value theorem


The proof of Cauchy's mean value theorem is based on the same idea as the proof of the mean value theorem. First we need to define a new function that satisfies the conditions of Rolle's theorem. Define the function h by

${\displaystyle h(x)={\bigl (}f(b)-f(a){\bigr )}{\bigl (}g(x)-g(a){\bigr )}-{\bigl (}g(b)-g(a){\bigr )}{\bigl (}f(x)-f(a){\bigr )},}$

which is continuous on [a,b] and differentiable on (a, b). Then

${\displaystyle h(a)={\bigl (}f(b)-f(a){\bigr )}{\bigl (}g(a)-g(a){\bigr )}-{\bigl (}g(b)-g(a){\bigr )}{\bigl (}f(a)-f(a){\bigr )}=0}$

and

${\displaystyle h(b)={\bigl (}f(b)-f(a){\bigr )}{\bigl (}g(b)-g(a){\bigr )}-{\bigl (}g(b)-g(a){\bigr )}{\bigl (}f(b)-f(a){\bigr )}=0,}$

so ${\displaystyle h(a)=h(b)}$ and Rolle's theorem applies. The derivative of h is

${\displaystyle h'(x)={\bigl (}f(b)-f(a){\bigr )}g'(x)-{\bigl (}g(b)-g(a){\bigr )}f'(x)}$

and Rolle's theorem states that it is equal to zero at some point, i.e., h'(c)=0 for some c ∈ (a, b). The equation for the derivative at c is

${\displaystyle h'(c)={\bigl (}f(b)-f(a){\bigr )}g'(c)-{\bigl (}g(b)-g(a){\bigr )}f'(c)=0,}$

therefore

${\displaystyle {\bigl (}f(b)-f(a){\bigr )}g'(c)={\bigl (}g(b)-g(a){\bigr )}f'(c).}$

If ${\displaystyle g'(c)}$ and ${\displaystyle g(b)-g(a)}$ are nonzero this can be written as

${\displaystyle {\frac {f'(c)}{g'(c)}}={\frac {f(b)-f(a)}{g(b)-g(a)}}.}$

## Mean value theorem in several variables

The mean value theorem in one variable generalizes to several variables by applying the theorem in one variable via parametrization. Let G be an open subset of Rn, and let ƒ : G → R be a differentiable function. Fix points x, y ∈ G such that the line segment between x and y lies in G, and define ${\displaystyle g(t)=f((1-t)x+ty)}$.  Since g is a differentiable function in one variable, the mean value theorem gives:

${\displaystyle g(1)-g(0)=g'(c)}$

for some c between 0 and 1. But since ${\displaystyle g(1)=f(y)}$ and ${\displaystyle g(0)=f(x)}$, computing ${\displaystyle g'(c)}$ explicitly we have:

${\displaystyle f(y)-f(x)=\nabla f((1-c)x+cy)\cdot (y-x)}$

where ${\displaystyle \nabla }$ denotes a gradient and ${\displaystyle \cdot }$ a dot product. Note that this is an exact analog of the theorem in one variable (in the case n=1 this is the theorem in one variable). By the Schwarz inequality, the equation gives the estimate:

${\displaystyle |f(y)-f(x)|\leq |\nabla f((1-c)x+cy)|\,|y-x|.}$

In particular, when the partial derivatives of ƒ are bounded, ƒ is Lipschitz continuous (a fortiori, uniformly continuous). Note that ƒ is not assumed to be continuously differentiable, nor continuous on the closure of G. However, the argument above uses the chain rule, which requires ƒ to be differentiable at each point of the segment; the mere existence of the partial derivatives, and hence of ${\displaystyle \nabla f}$, would not be sufficient.
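The Lipschitz estimate can be spot-checked for a sample function (the function and the two points below are illustrative choices):

```python
import math

# Sample function f(u, v) = sin u + cos v: each partial derivative is
# bounded by 1, so |grad f| <= sqrt(2) everywhere and the estimate above
# makes f Lipschitz with constant sqrt(2).
f = lambda u, v: math.sin(u) + math.cos(v)
L = math.sqrt(2)

x, y = (0.3, -1.2), (2.5, 0.7)
dist = math.dist(x, y)               # |y - x|
gap = abs(f(*y) - f(*x))             # |f(y) - f(x)|
```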

As an application of the above, we prove that ƒ is constant if G is connected and every partial derivative of ƒ is 0. Pick some point ${\displaystyle x_{0}\in G}$, and let ${\displaystyle g(x)=f(x)-f(x_{0})}$. We want to show ${\displaystyle g(x)=0}$ for every ${\displaystyle x\in G}$. For that, let ${\displaystyle E=\{x\in G|g(x)=0\}}$. Then E is closed and nonempty. It is open too: for every xE,

${\displaystyle |g(y)|=|g(y)-g(x)|\leq (0)|y-x|=0}$

for every y in some neighborhood of x. (Here, it is crucial that x and y are sufficiently close to each other.) Since G is connected, we conclude E = G.

Remark that all arguments in the above are made in a coordinate-free manner; hence, they actually generalize to the case when G is a subset of a Banach space.

## Mean value theorem for vector-valued functions

There is no exact analog of the mean value theorem for vector-valued functions. The problem is roughly speaking the following: If ${\displaystyle f:U\rightarrow \mathbb {R} ^{m}}$ is a differentiable function (where ${\displaystyle U\subset \mathbb {R} ^{n}}$ is open) and if ${\displaystyle x+th,\,x,h\in \mathbb {R} ^{n},\,t\in [0,1]}$ is the line segment in question (lying inside ${\displaystyle \,U}$), then one can apply the above parametrization procedure to each of the component functions ${\displaystyle f_{i}\,(i=1,\ldots ,m)}$ of ${\displaystyle \,f}$ (in the above notation set ${\displaystyle \,y=x+h}$). In doing so one finds points ${\displaystyle \,x+t_{i}h}$ on the line segment satisfying

${\displaystyle f_{i}(x+h)-f_{i}(x)=\nabla f_{i}(x+t_{i}h)\cdot h}$.

But generally there will not be a single point ${\displaystyle \,x+t^{*}h}$ on the line segment satisfying

${\displaystyle f_{i}(x+h)-f_{i}(x)=\nabla f_{i}(x+t^{*}h)\cdot h}$

for all ${\displaystyle \,i}$ simultaneously. (As a counterexample one could take ${\displaystyle f:[0,2\pi ]\rightarrow \mathbb {R} ^{2}}$ defined via the component functions ${\displaystyle \,f_{1}(x)=\cos x,}$ ${\displaystyle \,f_{2}(x)=\sin x}$. Then ${\displaystyle f(2\pi )-f(0)=0\,(\in \mathbb {R} ^{2})}$, but ${\displaystyle \,f_{1}'(x)=-\sin x}$ and ${\displaystyle \,f_{2}'(x)=\cos x}$ are never simultaneously zero as ${\displaystyle \,x}$ ranges over ${\displaystyle \,[0,2\pi ]}$.)
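A numerical check of this counterexample (a sketch; the 1000-point grid is just a sampling choice):

```python
import math

# Component functions of the counterexample f(x) = (cos x, sin x)
f1, f2 = math.cos, math.sin
d1 = lambda x: -math.sin(x)          # f1'
d2 = lambda x: math.cos(x)           # f2'

# The endpoint difference is the zero vector ...
diff = (f1(2 * math.pi) - f1(0), f2(2 * math.pi) - f2(0))

# ... but the derivative vector (f1'(x), f2'(x)) never vanishes:
# its norm is sqrt(sin^2 x + cos^2 x) = 1 at every sample point.
norms = [math.hypot(d1(2 * math.pi * k / 1000), d2(2 * math.pi * k / 1000))
         for k in range(1001)]
```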

However a certain type of generalization of the mean value theorem to vector-valued functions is obtained as follows: Let f be a continuously differentiable real-valued function defined on an open interval I, and let x as well as x+h be points of I. The mean value theorem in one variable tells us that there exists some ${\displaystyle \,t^{*}}$ between 0 and 1 such that

${\displaystyle f(x+h)-f(x)=f'(x+t^{*}h)\cdot h}$.

On the other hand we have

${\displaystyle f(x+h)-f(x)=\int _{x}^{x+h}f'(u)du=(\int _{0}^{1}f'(x+th)dt)\cdot h.}$

Thus, the value ${\displaystyle f\,'(x+t^{*}h)}$ at the particular point ${\displaystyle \,t^{*}}$ has been replaced by the mean value ${\displaystyle \int _{0}^{1}f'(x+th)dt}$. This last version can be generalized to vector valued functions:

Let ${\displaystyle U\subset \mathbb {R} ^{n}}$ be open, ${\displaystyle f:U\rightarrow \mathbb {R} ^{m}}$ continuously differentiable, and ${\displaystyle x\in U,\,h\in \mathbb {R} ^{n}}$ vectors such that the whole line segment ${\displaystyle x+th,\,0\leq t\leq 1}$ remains in ${\displaystyle \,U}$. Then we have:

${\displaystyle {\text{(*)}}\qquad f(x+h)-f(x)=(\int _{0}^{1}Df(x+th)dt)\cdot h,}$

where the integral of a matrix is to be understood componentwise. (${\displaystyle \,Df}$ denotes the Jacobian matrix of ${\displaystyle \,f}$.)

From this one can further deduce that if ||Df(x+th)|| is bounded for t between 0 and 1 by some constant M, then

${\displaystyle {\text{(**)}}\qquad ||f(x+h)-f(x)||\leq M||h||.}$
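Identity (*) can be verified numerically for a sample map (here f(t) = (t², t³), so n = 1 and m = 2; the choice of map, base point, and midpoint rule are all illustrative):

```python
# Componentwise check of (*) for the sample map f(t) = (t^2, t^3)
# with x = 0.5, h = 1.0; the integral of Df(x + t h) over [0, 1]
# is approximated by a midpoint rule.
f = lambda t: (t**2, t**3)
Df = lambda t: (2 * t, 3 * t**2)     # Jacobian (here a column of two entries)

x, h, N = 0.5, 1.0, 10000
avg = [0.0, 0.0]
for k in range(N):
    t = (k + 0.5) / N
    d = Df(x + t * h)
    avg[0] += d[0] / N
    avg[1] += d[1] / N

lhs = tuple(fb - fa for fa, fb in zip(f(x), f(x + h)))   # f(x+h) - f(x)
rhs = (avg[0] * h, avg[1] * h)                           # (integral of Df) . h
```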

Proof of (*). Write ${\displaystyle f_{i}\,}$ (${\displaystyle i=1,\ldots ,m}$) for the real valued components of ${\displaystyle \,f}$. Define the functions ${\displaystyle g_{i}:[0,1]\rightarrow \mathbb {R} }$ by ${\displaystyle g_{i}(t)\,:=\,f_{i}(x+th).}$

Then we have

${\displaystyle f_{i}(x+h)-f_{i}(x)\,=\,g_{i}(1)-g_{i}(0)=\int _{0}^{1}g_{i}'(t)dt=\int _{0}^{1}(\sum _{j=1}^{n}{\frac {\partial f_{i}}{\partial x_{j}}}(x+th)h_{j})dt=\sum _{j=1}^{n}(\int _{0}^{1}{\frac {\partial f_{i}}{\partial x_{j}}}(x+th)dt)h_{j}.}$

The claim follows since ${\displaystyle \,Df}$ is the matrix consisting of the components ${\displaystyle {\frac {\partial f_{i}}{\partial x_{j}}}}$, q.e.d.

Proof of (**). From (*) it follows that ${\displaystyle ||f(x+h)-f(x)||=||\int _{0}^{1}(Df(x+th)\cdot h)dt||\leq \int _{0}^{1}||Df(x+th)||\cdot ||h||dt\leq M||h||.}$

Here we have used the following

Lemma. Let ${\displaystyle v:[a,b]\rightarrow \mathbb {R} ^{m}}$ be a continuous function defined on the interval ${\displaystyle [a,b]\subset \mathbb {R} }$. Then we have

${\displaystyle {\text{(***)}}\qquad ||\int _{a}^{b}v(t)dt||\leq \int _{a}^{b}||v(t)||dt.}$

Proof of (***). Let ${\displaystyle u\in \mathbb {R} ^{m}}$ denote the value of the integral ${\displaystyle u:=\int _{a}^{b}v(t)dt.}$ Now

${\displaystyle ||u||^{2}=\langle u,u\rangle =\langle \int _{a}^{b}v(t)dt,u\rangle =\int _{a}^{b}\langle v(t),u\rangle dt\leq \int _{a}^{b}||v(t)||\cdot ||u||dt=||u||\int _{a}^{b}||v(t)||dt,}$

thus ${\displaystyle ||u||\leq \int _{a}^{b}||v(t)||dt}$ as desired. (Note the use of the Cauchy-Schwarz inequality.) This shows (***) and thereby finishes the proof of (**).

## Mean value theorems for integration

### First mean value theorem for integration

The first mean value theorem for integration states

If G : [a, b] → R is a continuous function and φ : [a, b] → R is an integrable positive function, then there exists a number x in (a, b) such that
${\displaystyle \int _{a}^{b}G(t)\varphi (t)\,dt=G(x)\int _{a}^{b}\varphi (t)\,dt.}$

In particular for φ(t) = 1, there exists x in (a, b) such that

${\displaystyle \int _{a}^{b}G(t)\,dt=\ G(x)(b-a).\,}$
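For instance, with G(t) = t² and φ ≡ 1 on [0, 1], the mean value of G is 1/3, attained at x = 1/√3. A numerical sketch (midpoint rule plus bisection; it assumes G is increasing, as it is here):

```python
def int_mvt_point(G, a, b, n=10000):
    # Midpoint-rule average of G over [a, b], then bisection for a point x
    # with G(x) equal to that average; assumes G is increasing (true here).
    avg = sum(G(a + (k + 0.5) * (b - a) / n) for k in range(n)) / n
    lo, hi = a, b
    while hi - lo > 1e-12:
        mid = (lo + hi) / 2
        if G(mid) >= avg:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

# G(t) = t^2 on [0, 1]: the mean value is 1/3, attained at x = 1/sqrt(3)
x = int_mvt_point(lambda t: t**2, 0.0, 1.0)
```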

### Proof of the first mean value theorem for integration

Let ${\displaystyle m=\inf\{G(x):x\in [a,b]\}}$ and ${\displaystyle M=\sup\{G(x):x\in [a,b]\}}$. It follows that

${\displaystyle m\int _{a}^{b}\varphi (t)\,dt\leq \int _{a}^{b}G(t)\varphi (t)\,dt\leq M\int _{a}^{b}\varphi (t)\,dt}$

by monotonicity of the integral. Dividing through by ${\displaystyle \int _{a}^{b}\varphi (t)\,dt,}$ we have that

${\displaystyle m\leq {\frac {\int _{a}^{b}G(t)\varphi (t)\,dt}{\int _{a}^{b}\varphi (t)\,dt}}\leq M.}$

Since G(t) is continuous, the intermediate value theorem implies that there exists x in [a, b] such that

${\displaystyle G(x)={\frac {\int _{a}^{b}G(t)\varphi (t)\,dt}{\int _{a}^{b}\varphi (t)\,dt}}}$

which completes the proof.

### Second mean value theorem for integration

There are various slightly different theorems called the second mean value theorem for integration. A commonly found version is as follows:

If G : [a, b] → R is a positive monotonically decreasing function and φ : [a, b] → R is an integrable function, then there exists a number x in (ab] such that
${\displaystyle \int _{a}^{b}G(t)\varphi (t)\,dt=G(a+0)\int _{a}^{x}\varphi (t)\,dt.}$

Here G(a + 0) stands for ${\displaystyle \lim _{t\to a^{+}}G(t)}$ , the existence of which follows from the conditions. Note that it is essential that the interval (a, b] contains b. A variant not having this requirement is:

If G : [a, b] → R is a monotonic (not necessarily decreasing and positive) function and φ : [a, b] → R is an integrable function, then there exists a number x in (a, b) such that
${\displaystyle \int _{a}^{b}G(t)\varphi (t)\,dt=G(a+0)\int _{a}^{x}\varphi (t)\,dt+G(b-0)\int _{x}^{b}\varphi (t)\,dt.}$

This variant was proved by Hiroshi Okamura in 1947.[citation needed]

## References

1. J. J. O'Connor and E. F. Robertson (2000). Paramesvara, MacTutor History of Mathematics archive.