Introduction to Chemical Engineering Processes/Numerical Root Finding Methods

Basics of Rootfinding

Rootfinding is the determination of solutions to single-variable equations or to systems of n equations in n unknowns (provided that such solutions exist). The basics of the method revolve around the determination of roots

A root of a function $F(x_{1},x_{2},...)$ in any number of variables is defined as the solution to the equation $F(x_{1},x_{2},...)=0$ . In order to use any of the numerical methods in this section, the equation should be put in a specific form, and this is one of the more common ones, used for all methods except the iterative method.

However, it is easy to put a function into this form. If you start with an equation of the form:

$F_{1}(x_{1},x_{2},...)=F_{2}(x_{1},x_{2},...)$

then subtracting $F_{2}$ will yield the required form. Do not forget to do this, even if there is only a constant on one side!

Example:

If you want to use the bisection method later in this section to find one of the solutions of the equation $1=x^{2}$ , you should rewrite the equation as $0=x^{2}-1$ so as to put it in the correct form.

Since any equation can be put into this form, the methods can potentially be applied to any function, though they work better for some functions than others.

Analytical vs. Numerical Solutions

An analytical solution to an equation or system is a solution which can be arrived at exactly using some mathematical tools. For example, consider the function $y=ln(x)$ , graphed below.

The root of this function is, by convention, when $y=0$ , or when this function crosses the x-axis. Hence, the root will occur when $ln(x)=0\rightarrow x=e^{0}=1$

The answer x=1 is an analytical solution because through the use of algebra, we were able to come up with an exact answer.

On the other hand, attempting to solve an equation like:

$-x=ln(x)$

analytically is sure to lead to frustration because it is not possible with elementary methods. In such a case it is necessary to seek a numerical solution, in which guesses are made until the answer is "close enough", but you'll never know what the exact answer is.

All that the numerical methods discussed below do is give you a systematic method of guessing solutions so that you'll be likely (and in some cases guaranteed) to get closer and closer to the true answer. The problem with numerical methods is that most are not guaranteed to work without a good enough initial guess. Therefore, it is valuable to try a few points until you get somewhere close and then start with the numerical algorithm to get a more accurate answer. They are roughly in order from the easiest to use to the more difficult but faster-converging algorithms.

Rootfinding Algorithms

Iterative solution

Iterative solutions in their purest form will solve the desired function so that it is in the form:

$x=f(x)$

Then, a value for x is guessed, and f(x) is calculated. The new value of x is then re-inserted into f(x), and the process is repeated until the value of x changes very little.

The following example illustrates this procedure.

Example:

Use an iterative solution to calculate the root of $x+ln(x)=0$

Solution: Solve the equation for x:

$e^{-x}=x$

First we need to guess an x to get it started. Let's try $x=0.5$

Then we have:

$x=e^{-0.5}=0.6065$

$x_{2}=e^{-0.6065}=0.5453$

$x_{3}=e^{-0.5453}=0.5796$

$x_{4}=e^{-0.5796}=0.5601$

$x_{5}=e^{-0.5601}=0.5711$

$x_{6}=e^{-0.5711}=0.5649$

$x_{7}=e^{-0.5649}=0.5684$

Thus to two decimal places the root is $x=0.56$ . More iterations could be performed to get a more accurate answer if desired.

This method has some rather severe limitations as we'll see in this example:

Example:

Repeat the above but this time solve for x a different way. What do you find?

Solution: To illustrate the point, let's start with a guess of $x=0.56$

The other way to solve for x is the more obvious way: $x=-ln(x)$

$x=-ln(0.56)=0.5798$

$x_{2}=-ln(0.5798)=0.5451$

$x_{3}=-ln(0.5451)=0.6068$

Clearly, even though we started with a very good guess, the solution is diverging!

This example shows that the success of the iteration method strongly depends on the properties of the function on the right-hand side. In particular, it has to do with how large the slope of the function is at the root. If the slope is too large, the method will not converge, and even if it is small the method converges slowly. Therefore, it is generally undesirable to use this method, though some more useful algorithms are based on it (which is why it is presented here).

Iterative Solution with Weights

Although the iterative solution method has its downfalls, it can be drastically improved through the use of averaging. In this method, the function is still solved for x in the form:

$x=f(x)$

From the initial guess $x_{0}$ , the function f(x) is used to generate the second guess $x_{1}$ . However, rather than simply putting $x_{1}$ into f(x), a weighted average of $x_{0}$ and $x_{1}$ is made:

$x_{1}(New)=\alpha *x_{0}+(1-\alpha )*x_{1}(old),0\leq \alpha \leq 1$

The term $\alpha$ is called the weight. The most common value of the weight is one-half, in which case the next value to plug into f(x) is simply the average of $x_{0}$ and $x_{1}(old)$ :

x_{1}(New)={\frac {x_{0}+x_{1}(Old)}{2}}

This new value is then plugged into f(x), averaged with the result, and this is repeated until convergence.

The following examples show that this method converges faster and with more reliability than normal iterative solution.

Example:

Find the root of $x+ln(x)=0$ using the iterative method with a weight of $\alpha ={\frac {1}{2}}$

Solution: Let's start with a guess of 0.5 like last time, and compare what happens this time from what happened with normal iteration.

$x_{1}=e^{-0.5}=0.6065$

$x_{1}(new)={\frac {0.5+0.6065}{2}}=0.5533$

$x_{2}=e^{-0.5533}=0.5751$

$x_{2}(new)={\frac {0.5533+0.5751}{2}}=0.5642$

$x_{3}=e^{-0.5642}=0.5688$

Here, after only three evaluations of the function (which usually takes the longest time of all the steps), we have the root to the same accuracy as seven evaluations with the other method!

The method is not only faster-converging but also more stable, so that it can actually be used solving the equation the other way too.

Example:

Starting with an initial guess of $x=0.5$ and using $x=-ln(x)$ and the weighted iteration method with $\alpha ={\frac {1}{2}}$ , find the root of the equation.

Solution: Starting with $x_{0}=0.5$ we have:

$x_{1}=-ln(0.5)=0.693$

$x_{1}(new)={\frac {0.693+0.5}{2}}=0.597$

$x_{2}=-ln(0.597)=0.517$

$x_{2}(new)={\frac {0.517+0.597}{2}}=0.557$

$x_{3}=-ln(0.557)=0.5856$

$x_{3}(new)={\frac {0.5856+0.557}{2}}=0.571$

$x_{4}=-ln(0.571)=0.560$

$x_{4}(new)={\frac {0.560+0.571}{2}}=0.565$

$x_{5}=-ln(0.565)=0.570$

Therefore we can (slowly) converge in this case using the weighted iteration method to the solution.

Notice that in this case, if we use regular iteration the result only converged if the equation was solved in a certain way. Using weighted iteration, it is possible to solve it either way and obtain a solution, but one way is clearly faster than the other. However, weighting will accelerate the algorithm in most cases and is relatively easy to implement, so it is a worthwhile method to use.

Bisection Method

Let us consider an alternative approach to rootfinding. Consider a function f(x) = 0 which we desire to find the roots of. If we let a second variable $y=f(x)$ , then y will (almost always) change sign between the left-hand side of the root and the right-hand side. This can be seen in the above picture of $y=ln(x)$ , which changes from negative to the left of the root $x=1$ to positive to its right.

The bisection method works by taking the observation that a function changes sign between two points, and narrowing the interval in which the sign change occurs until the root contained within is tightly enclosed. This only works for a continuous function, in which there are no jumps or holes in the graph, but a large number of commonly-used functions are like this including logarithms (for positive numbers), sine and cosine, and polynomials.

As a more formalized explanation, consider a function $y=f(x)$ that changes sign between $x=a$ and $x=b$ We can narrow the interval by:

Evaluating the function at the midpoint
Determining whether the function changes signs or not in each sub-interval
If the continuous function changes sign in a sub-interval, that means it contains a root, so we keep the interval.
If the function does not change sign, we discard it. This can potentially cause problems if there are two roots in the interval,so the bisection method is not guaranteed to find ALL of the roots.

Though the bisection method is not guaranteed to find all roots, it is guaranteed to find at least one if the original endpoints had opposite signs.

The process above is repeated until you're as close as you like to the root.

Example:

Find the root of $y=x+ln(x)$ using the bisection method

By plugging in some numbers, we can find that the function changes sign between $x=0.5$ $(y=-0.193)$ and $x=1$ $(y=1)$ . Therefore, since the function is continuous, there must be at least one root in this interval.

First Interval: $0.5(-)<x<1(+)$
Midpoint: $x=0.75$
y at midpoint: $y=0.75+ln(0.75)=0.462$ Therefore, the sign changes between 0.5 and 0.75 and does not between 0.75 and 1.
New Interval: $0.5(-)<x<0.75(+)$
Midpoint: $x=0.625$
y at midpoint: $y=0.155$
New Interval: $0.5(-)<x<0.625(+)$
Midpoint: $x=0.5625$
y at midpoint: $y=-0.0129$

We could keep doing this, but since this result is very close to the root, lets see if there's a number smaller than 0.625 which gives a positive function value and save ourselves some time.

x Value: $x=0.57$
y value: $y=0.00788$

Hence x lies between 0.5625 and 0.57 (since the function changes sign on this interval).

Note that convergence is slow but steady with this method. It is useful for refining crude approximations to something close enough to use a faster but non-guaranteed method such as weighted iteration.

Regula Falsi

The Regula Falsi method is similar the bisection method. You must again start with two x values between which the function f(x) you want to find the root of changes. However, this method attempts to find a better place than the midpoint of the interval to split it.It is based on the hypothesis that instead of arbitrarily using the midpoint of the interval as a guide, we should do one extra calculation to try and take into account the shape of the curve. This is done by finding the secant line between two endpoints and using the root of that line as the splitting point.

More formally:

Draw or calculate the equation for the line between the two endpoints (a,f(a)) and (b,f(b)).
Find where this line intersects the x-axis (or when y = 0), giving you x = c
Use this x value to evaluate the function, giving you f(c)
The sub-intervals are then treated as in the bisection method. If the sign changes between f(a) and f(c), keep the interval; otherwise, throw it away. Do the same between f(c) and f(b).
Repeat until you're at a desired accuracy.

Use these two formulas to solve for the secant line y = mx + B:

m={\frac {f(b)-f(a)}{b-a}}

$B=f(b)-m*b=f(a)-m*a$ (you can use either)

The regula falsi method is guaranteed to converge to a root, but it may or may not be faster than the bisection method, depending on how long it takes to calculate the slope of the line and the shape of the function.

Example:

Find the root of $x+ln(x)=0$ but this time use the regula falsi method.

Solution: Be careful with your bookkeeping with this one! It's more important to keep track of y values than it was with bisection, where all we cared about was the sign of the function, not it's actual value.

For comparison with bisection, let's choose the same initial guesses: $a=0.5$ and $b=1$ , for which $f(a)=-0.693$ and $f(b)=1$ .

First interval: $0.5<x<1,-0.193(-)<f(x)<1(+)$
Secant line: $y=2.386x-1.386$
Root of secant line: $x=0.581$
Function value at root: $f(x)=0.581+ln(0.581)=0.038(+)$

Notice that in this case, we can discard a MUCH larger interval than with the bisection method (which would use $x=0.75$ as the splitting point)

Second interval: $0.5<x<0.581,-0.193(-)<f(x)<0.038(+)$
Secant line: $y=2.852x-1.619$
Root of secant line: $x=0.5676$
Function value at root: $f(x)=0.0013$

We come up with practically the exact root after only two iterations!

In some cases, the regula falsi method will take longer than the bisection method, depending on the shape of the curve. However, it generally worth trying for a couple of iterations due to the drastic speed increases possible.

Tangent Method (Newton's Method)

In this method, we attempt to find the root of a function y = f(x) using the tangent lines to functions. This is similar to the secant method, except it "cuts loose" from the old point and only concentrates on the new one, thus hoping to avoid hang-ups such as the one experienced in the example.

Since this class assumes students have not taken calculus, the tangent will be approximated by finding the equation of a line between two very close points, which are denoted (x) and $(x+\delta x)$ . The method works as follows:

Choose one initial guess, $x_{1}$
Evaluate the function f(x) at $x=x_{1}$ and at $x=x_{1}+\delta x$ where $\delta x$ is a small number. These yield two points on your (approximate) tangent line.
Find the equation for the tangent line using the formulas given above.
Find the root of this line. This is $x_{2}$
Repeat steps 2-4 until you're as close as you like to the root.

This method is not guaranteed to converge unless you start off with a good enough first guess, which is why the guaranteed methods are useful for generating one. However, since this method, when it converges, is much faster than any of the others, it is preferable to use if a suitable guess is available.

Example:

Find the root of $x+ln(x)=y$ using the tangent method.