7. Root-finding Without Derivatives

References:

  • Section 1.5.1 Secant Method and variants in Sauer

  • Section 2.3 Newton’s Method and its Extensions in Burden & Faires (just the later sub-sections, on The Secant Method and The Method of False Position).

7.1. Introduction

We have already seen one method for solving \(f(x) = 0\) without needing to know any derivatives of \(f\): the Bisection Method, a.k.a. Interval Halving. However, we have also seen that that method is far slower than Newton’s Method.

Here we explore methods that are almost the best of both worlds: about as fast as Newton’s method but not needing derivatives.

The first of these is the Secant Method. Later in this course we will see how this has been merged with the Bisection Method and Polynomial Interpolation to produce the current state-of-the-art approach, which was only perfected in the 1960s.

# We will often need resources from the modules numpy and pyplot:
import numpy as np
from matplotlib import pyplot as plt

# We can also import items from a module individually, so they can be used by "first name only":
from numpy import abs, cos

# Since we do a lot of graphics in this section, some more short-hands:
from matplotlib.pyplot import figure, title, plot, xlabel, ylabel, grid, legend, show
from numpy import linspace

# Also, some from the module for this book:
from numerical_methods_module import newton

7.2. Using Linear Approximation Without Derivatives

One quirk of the Bisection Method is that it uses only the signs of the values \(f(a)\) and \(f(b)\), not their magnitudes. If one of these is far smaller than the other, one might guess that the root is closer to that end of the interval. This leads to the idea of:

  • starting with an interval \([a, b]\) known to contain a zero of \(f\),

  • connecting the two points \((a, f(a))\) and \((b, f(b))\) with a straight line, and

  • finding the \(x\)-value \(c\) where this line crosses the \(x\)-axis. In other words, approximating the function by a secant line, in place of the tangent line used in Newton’s Method.

7.2.1. First Attempt: The Method of False Position

The next step requires some care. The first idea (from almost a millennium ago) was to use this new approximation \(c\) as is done with bisection: check which of the intervals \([a, c]\) and \([c, b]\) has the sign change and use it as the new interval \([a, b]\); this is called The Method of False Position (or Regula Falsi, since the academic world used Latin in those days).

The secant line between \((a, f(a))\) and \((b, f(b))\) is

\[ L(x) = \frac{f(a)(b-x) + f(b)(x-a)}{b-a} \]

and its zero is at

\[ c = \frac{a f(b) - f(a) b}{f(b) - f(a)} \]
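
As a quick check of this formula, one can verify numerically that this value of \(c\) is indeed the zero of \(L(x)\); the following small snippet (not part of the method itself) uses the same test case \(f(x) = x - \cos x\) on \([-1, 1]\) as in the examples below.

# A quick numerical check that the formula for c gives the zero of the secant line L(x);
# the test case f(x) = x - cos(x) on [a, b] = [-1, 1] matches the examples below.
from numpy import cos
def f(x): return x - cos(x)
a, b = -1.0, 1.0
c = (a * f(b) - f(a) * b) / (f(b) - f(a))
L_c = (f(a) * (b - c) + f(b) * (c - a)) / (b - a)
print(f"c = {c}, L(c) = {L_c}")  # L(c) should be zero, up to rounding error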

This is easy to implement, and an example will show that it sort of works, but with a weakness that hampers it a bit:

def false_position(f, a, b, errorTolerance=1e-15, maxIterations=15, demoMode=False):
    """Solve f(x)=0 in the interval [a, b] by the Method of False Position.
    This code also illustrates a few ideas that I encourage, such as:
    - Avoiding infinite loops, by using for loops and break
    - Avoiding repeated evaluation of the same quantity
    - Use of descriptive variable names
    - Use of "camelCase" to turn descriptive phrases into valid Python variable names
    - An optional "demonstration mode" to display intermediate results.
    """
    if demoMode: print(f"Solving by the Method of False Position.")
    fa = f(a)
    fb = f(b)
    for iteration in range(maxIterations):
        if demoMode: print(f"\nIteration {iteration}:")
        c = (a * fb - fa * b)/(fb - fa)
        fc = f(c)
        if fa * fc < 0:
            b = c
            fb = fc  # N.B. When b is updated, so must be fb = f(b)
        else:
            a = c
            fa = fc
        errorBound = b - a
        if demoMode:
            print(f"The root is in interval [{a}, {b}]")
            print(f"The new approximation is {c}, with error bound {errorBound:0.4}, backward error {abs(fc):0.4}")
        if errorBound < errorTolerance:
            break
    # Whether we got here due to accuracy or running out of iterations,
    # return the information we have, including an error bound:
    root = c  # the newest value is probably the most accurate
    return (root, errorBound)

Note: For a more concise presentation, you could omit the above def and instead import this function with

from numerical_methods_module import false_position
def f(x): return x - cos(x)
(root, errorBound) = false_position(f, a=-1, b=1, demoMode=True)
print(f"\nThe Method of False Position gave approximate root is {root},")
print(f"with estimate error {errorBound:0.4}, backward error {abs(f(root)):0.4}")
Solving by the Method of False Position.

Iteration 0:
The root is in interval [0.5403023058681398, 1]
The new approximation is 0.5403023058681398, with error bound 0.4597, backward error 0.3173

Iteration 1:
The root is in interval [0.7280103614676172, 1]
The new approximation is 0.7280103614676172, with error bound 0.272, backward error 0.01849

Iteration 2:
The root is in interval [0.7385270062423998, 1]
The new approximation is 0.7385270062423998, with error bound 0.2615, backward error 0.000934

Iteration 3:
The root is in interval [0.7390571666782676, 1]
The new approximation is 0.7390571666782676, with error bound 0.2609, backward error 4.68e-05

Iteration 4:
The root is in interval [0.7390837322783136, 1]
The new approximation is 0.7390837322783136, with error bound 0.2609, backward error 2.345e-06

Iteration 5:
The root is in interval [0.7390850630385933, 1]
The new approximation is 0.7390850630385933, with error bound 0.2609, backward error 1.174e-07

Iteration 6:
The root is in interval [0.7390851296998365, 1]
The new approximation is 0.7390851296998365, with error bound 0.2609, backward error 5.883e-09

Iteration 7:
The root is in interval [0.7390851330390691, 1]
The new approximation is 0.7390851330390691, with error bound 0.2609, backward error 2.947e-10

Iteration 8:
The root is in interval [0.7390851332063397, 1]
The new approximation is 0.7390851332063397, with error bound 0.2609, backward error 1.476e-11

Iteration 9:
The root is in interval [0.7390851332147188, 1]
The new approximation is 0.7390851332147188, with error bound 0.2609, backward error 7.394e-13

Iteration 10:
The root is in interval [0.7390851332151385, 1]
The new approximation is 0.7390851332151385, with error bound 0.2609, backward error 3.708e-14

Iteration 11:
The root is in interval [0.7390851332151596, 1]
The new approximation is 0.7390851332151596, with error bound 0.2609, backward error 1.776e-15

Iteration 12:
The root is in interval [0.7390851332151606, 1]
The new approximation is 0.7390851332151606, with error bound 0.2609, backward error 1.11e-16

Iteration 13:
The root is in interval [0.7390851332151607, 1]
The new approximation is 0.7390851332151607, with error bound 0.2609, backward error 0.0

Iteration 14:
The root is in interval [0.7390851332151607, 1]
The new approximation is 0.7390851332151607, with error bound 0.2609, backward error 0.0

The Method of False Position gave approximate root 0.7390851332151607,
with estimated error 0.2609, backward error 0.0

The good news is that the approximations are approaching the zero reasonably fast — far faster than bisection — as indicated by the backward errors improving by a factor of better than ten at each iteration.

The bad news is that one end gets “stuck”, so the interval does not shrink on both sides, and the error bound stays large.

This behavior is generic: when \(f\) has the same convexity throughout the interval \([a, b]\), the secant line always crosses the \(x\)-axis on the same side of the zero, so that one end-point persists. In this case the curve is concave up, so the secant line always crosses to the left of the root, as seen in the following graphs.

def graph_false_position(f, a, b, maxIterations=3):
    """Graph a few iterations of the Method of False Position for solving f(x)=0 in the interval [a, b].
    """
    fa = f(a)
    fb = f(b)
    for iteration in range(maxIterations):
        c = (a * fb - fa * b)/(fb - fa)
        fc = f(c)
        abc = [a,b, c]
        left = np.min(abc)
        right = np.max(abc)
        x = linspace(left, right)
        figure(figsize=[16,6])
        title(f"Iteration {iteration+1}, Method of False Position")
        xlabel("$x$")
        plot(x, f(x))
        plot([left, right], [f(left), f(right)])  # the secant line
        plot([left, right], [0, 0], 'k')  # the x-axis line
        plot(abc, f(abc), 'r*')
        show()  # The Windows version of JupyterLab might need this command; it is harmless anyway.
        if fa * fc < 0:
            b = c
            fb = fc  # N.B. When b is updated, so must be fb = f(b)
        else:
            a = c
            fa = fc
graph_false_position(f, a=-1, b=1)
(Figures: the curve, the secant line, and the points \(a\), \(b\), \(c\) for iterations 1, 2 and 3 of the Method of False Position.)

7.2.2. Refinement: Always Use the Two Most Recent Approximations — The Secant Method

The basic solution is to always discard the oldest approximation — at the cost of not always having the zero surrounded! This gives the Secant Method.

For a mathematical description, one typically enumerates the successive approximations as \(x_0\), \(x_1\), etc., so the notation above gets translated with \(a \to x_{k-2}\), \(b \to x_{k-1}\), \(c \to x_{k}\); then the formula becomes the recursive rule

\[ x_k = \frac{x_{k-2} f(x_{k-1}) - f(x_{k-2}) x_{k-1}}{f(x_{k-1}) - f(x_{k-2})} \]
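
Rearranging, this can equivalently be written as

\[ x_k = x_{k-1} - f(x_{k-1}) \frac{x_{k-1} - x_{k-2}}{f(x_{k-1}) - f(x_{k-2})} \]

which is just Newton’s Method with the derivative \(f'(x_{k-1})\) replaced by the slope of the secant line through the two most recent points.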

Two differences from above:

  • previously we could assume that \(a<b\), but now we do not know the order of the various \(x_k\) values, and

  • the root is not necessarily between the two most recent values, so we no longer have that simple error bound. (In fact, we will see that the zero is typically surrounded two-thirds of the time!)

Instead, we use the magnitude of \(b-a\) which is now \(|x_k - x_{k-1}|\), and this is only an estimate of the error. This is the same as used for Newton’s Method; as there, it is still useful as a condition for ending the iterations and indeed tends to be pessimistic, so that we typically do one more iteration than needed — but it is not on its own a complete guarantee of having achieved the desired accuracy.

7.2.3. Pseudo-code for a Secant Method Algorithm

Input function \(f\), interval endpoints \(x_0\) and \(x_1\), an error tolerance \(E_{tol}\), and an iteration limit \(N\)

for k from 2 to N:
\(\quad\) \(\displaystyle x_k \leftarrow \frac{x_{k-2} f(x_{k-1}) - f(x_{k-2}) x_{k-1}}{f(x_{k-1}) - f(x_{k-2})}\)
\(\quad\) Evaluate the error estimate \(E_{est} \leftarrow |x_k - x_{k-1}|\)
\(\quad\) if \(E_{est} \leq E_{tol}\):
\(\quad\quad\) End the iterations
\(\quad\) else:
\(\quad\quad\) Go around another time
end for
Output the final \(x_k\) as the approximate root and \(E_{est}\) as an estimate of its absolute error.

7.2.4. Python Code for this Secant Method Algorithm

We could write Python code that closely follows this notation, accumulating a list of the values \(x_k\).

However, since we only ever need the two most recent values to compute the new one, we can instead just store these three, in the same way that we recycled the variables a, b and c. Here I use more descriptive names though:

def secant_method(f, a, b, errorTolerance=1e-15, maxIterations=15, demoMode=False):
    """Solve f(x)=0 in the interval [a, b] by the Secant Method."""
    if demoMode:
        print(f"Solving by the Secant Method.")        
    # Some more descriptive names
    x_older = a
    x_more_recent = b
    f_x_older = f(x_older)
    f_x_more_recent = f(x_more_recent)
    for iteration in range(maxIterations):
        if demoMode: print(f"\nIteration {iteration}:")
        x_new = (x_older * f_x_more_recent - f_x_older * x_more_recent)/(f_x_more_recent - f_x_older)
        f_x_new = f(x_new)
        (x_older, x_more_recent) = (x_more_recent, x_new)
        (f_x_older, f_x_more_recent) = (f_x_more_recent, f_x_new)
        errorEstimate = abs(x_older - x_more_recent)
        if demoMode:
            print(f"The latest pair of approximations are {x_older} and {x_more_recent},")
            print(f"where the function's values are {f_x_older:0.4} and {f_x_more_recent:0.4} respectively.")
            print(f"The new approximation is {x_new}, with estimated error {errorEstimate:0.4}, backward error {abs(f_x_new):0.4}")
        if errorEstimate < errorTolerance:
            break
    # Whether we got here due to accuracy or running out of iterations,
    # return the information we have, including an error estimate:
    return (x_new, errorEstimate)

Note: As above, you could omit the above def and instead import this function with

from numerical_methods_module import secant_method
(root, errorEstimate) = secant_method(f, a=-1, b=1, demoMode=True)
print(f"\nThe Secant Method gave approximate root is {root},")
print(f"with estimated error {errorEstimate:0.4}, backward error {abs(f(root)):0.4}")
Solving by the Secant Method.

Iteration 0:
The latest pair of approximations are 1 and 0.5403023058681398,
where the function's values are 0.4597 and -0.3173 respectively.
The new approximation is 0.5403023058681398, with estimated error 0.4597, backward error 0.3173

Iteration 1:
The latest pair of approximations are 0.5403023058681398 and 0.7280103614676172,
where the function's values are -0.3173 and -0.01849 respectively.
The new approximation is 0.7280103614676172, with estimated error 0.1877, backward error 0.01849

Iteration 2:
The latest pair of approximations are 0.7280103614676172 and 0.7396270126307336,
where the function's values are -0.01849 and 0.000907 respectively.
The new approximation is 0.7396270126307336, with estimated error 0.01162, backward error 0.000907

Iteration 3:
The latest pair of approximations are 0.7396270126307336 and 0.7390838007832722,
where the function's values are 0.000907 and -2.23e-06 respectively.
The new approximation is 0.7390838007832722, with estimated error 0.0005432, backward error 2.23e-06

Iteration 4:
The latest pair of approximations are 0.7390838007832722 and 0.7390851330557805,
where the function's values are -2.23e-06 and -2.667e-10 respectively.
The new approximation is 0.7390851330557805, with estimated error 1.332e-06, backward error 2.667e-10

Iteration 5:
The latest pair of approximations are 0.7390851330557805 and 0.7390851332151607,
where the function's values are -2.667e-10 and 0.0 respectively.
The new approximation is 0.7390851332151607, with estimated error 1.594e-10, backward error 0.0

Iteration 6:
The latest pair of approximations are 0.7390851332151607 and 0.7390851332151607,
where the function's values are 0.0 and 0.0 respectively.
The new approximation is 0.7390851332151607, with estimated error 0.0, backward error 0.0

The Secant Method gave approximate root 0.7390851332151607,
with estimated error 0.0, backward error 0.0
def graph_secant_method(f, a, b, maxIterations=5):
    """Graph a few iterations of the Secant Method for solving f(x)=0 in the interval [a, b].
    """
    x_older = a
    x_more_recent = b
    f_x_older = f(x_older)
    f_x_more_recent = f(x_more_recent)
    for iteration in range(maxIterations):
        x_new = (x_older * f_x_more_recent - f_x_older * x_more_recent)/(f_x_more_recent - f_x_older)
        f_x_new = f(x_new)
        latest_three_x_values = [x_older, x_more_recent, x_new]
        left = np.min(latest_three_x_values)
        right = np.max(latest_three_x_values)
        x = linspace(left, right)
        figure(figsize=[16,6])
        title(f"Iteration {iteration+1}, Secant Method")
        xlabel("$x$")
        plot(x, f(x))
        # The secant line through (x_older, f(x_older)) and (x_more_recent, f(x_more_recent)),
        # extended across the plotting window, since x_new can fall outside that pair of points:
        slope = (f_x_more_recent - f_x_older)/(x_more_recent - x_older)
        plot([left, right], [f_x_older + slope*(left - x_older), f_x_older + slope*(right - x_older)])
        plot([left, right], [0, 0], 'k')  # the x-axis line
        plot(latest_three_x_values, f(latest_three_x_values), 'r*')
        show()  # The Windows version of JupyterLab might need this command; it is harmless anyway.
        (x_older, x_more_recent) = (x_more_recent, x_new)
        (f_x_older, f_x_more_recent) = (f_x_more_recent, f_x_new)
        errorEstimate = abs(x_older - x_more_recent)
graph_secant_method(f, a=-1, b=1)
(Figures: the curve, the secant line, and the latest three approximations for iterations 1 through 5 of the Secant Method.)

7.2.5. Observations

  • This converges faster than the Method of False Position (and far faster than Bisection).

  • The majority of iterations do have the root surrounded (a sign-change in \(f\) between the two most recent approximations), but every third one — here the second and the fifth — does not.

  • Comparing the error estimate to the backward error, the error estimate is in fact quite pessimistic (and so fairly trustworthy); it is typically of similar size to the backward error at the previous iteration.

The last point is a quite common occurrence: the available error estimates are often “trailing indicators”, closer to the error in the previous approximation than to the error in the current one. For example, recall that we saw the same thing with Newton’s Method when we used \(|x_k - x_{k-1}|\) to estimate the error \(E_k := x_k - r\) and saw that it is in fact closer to the previous error, \(E_{k-1}\).
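
To see this concretely, here is a small illustration (not part of the runs above) that reuses f and the numpy abs imported earlier, and takes the converged value from the Secant Method run above as a stand-in for the exact root \(r\); it compares the error estimate \(|x_k - x_{k-1}|\) with the actual error \(|x_k - r|\) at each iteration.

# Compare the error estimate |x_k - x_{k-1}| with the actual error |x_k - r| for the Secant Method.
# The converged value from the run above serves as a stand-in for the exact root r.
r = 0.7390851332151607
x_older, x_more_recent = -1.0, 1.0
f_x_older, f_x_more_recent = f(x_older), f(x_more_recent)
for iteration in range(6):
    x_new = (x_older * f_x_more_recent - f_x_older * x_more_recent)/(f_x_more_recent - f_x_older)
    errorEstimate = abs(x_new - x_more_recent)
    actualError = abs(x_new - r)
    print(f"Iteration {iteration}: error estimate {errorEstimate:0.4}, actual error {actualError:0.4}")
    (x_older, x_more_recent) = (x_more_recent, x_new)
    (f_x_older, f_x_more_recent) = (f_x_more_recent, f(x_new))

The error estimate in each row should be close to the actual error from the previous row, illustrating this “trailing indicator” behavior.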


This work is licensed under Creative Commons Attribution-ShareAlike 4.0 International