2. Solving Nonlinear Systems of Equations by Generalizations of Newton’s Method — a Brief Introduction

Version of April 13, 2021

References:

  • Section 2.7 Nonlinear Systems of Equations of Sauer; in particular Sub-section 2.7.1 Multivariate Newton’s Method.

  • Chapter 10 Numerical Solution of Nonlinear Systems of Equations of Burden & Faires; in particular Sections 10.1 and 10.2.

2.1. Background

A system of simultaneous nonlinear equations

\[ F(x) = 0, \; F:\mathbb{R}^n \to \mathbb{R}^n \]

can be solved by a variant of Newton’s method.

Aside on notation: To emphasize analogies between results for vector-valued quantities and what we have already seen for real- or complex-valued quantities, several notational choices are made in these notes:

  • Using the same notation for vectors as for real or complex numbers (no over-arrows or bold face).

  • Often denoting the derivative of function \(f\) as \(Df\) rather than \(f'\), and higher derivatives as \(D^p f\) rather than \(f^{(p)}\). Partial derivatives of \(f(x, y, \dots)\) are \(D_x f\), \(D_y f\), etc., and for vector arguments \(x = (x_1, \dots , x_j, \dots , x_n)\), they are \(D_{x_1} f, \dots , D_{x_j} f , \dots , D_{x_n} f\), or more concisely, \(D_1 f, \dots , D_j f , \dots , D_n f\). (This also fits better with Python code notation — even for partial derivatives.)

  • Subscripts will mostly be reserved for components of vectors, with the terms of a sequence labelled by superscripts, \(x^{(k)}\) and such.

  • Explicit division is avoided.

However, I use capital letters for vector-valued functions, by analogy with the use of capital letters for matrices.

Rewriting Newton’s method in this new style,

\[ x^{(k+1)} = x^{(k)} - \frac{f(x^{(k)})}{Df(x^{(k)})} \]

or, to avoid explicit division, introducing the useful increment \(\delta^{(k)} := x^{(k+1)} - x^{(k)}\),

\[ Df(x^{(k)})\, \delta^{(k)} = -f(x^{(k)}), \quad x^{(k+1)} = x^{(k)} + \delta^{(k)}. \]
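As an illustration (the function name and the test equation here are my own, not from the notes), the scalar iteration in this increment form can be sketched in Python:

```python
def newton_scalar(f, Df, x0, tol=1e-12, max_iter=50):
    """Scalar Newton's method in increment form:
    solve Df(x_k) * delta_k = -f(x_k), then set x_{k+1} = x_k + delta_k.
    (A hypothetical sketch, not code from these notes.)"""
    x = x0
    for _ in range(max_iter):
        delta = -f(x) / Df(x)  # the scalar case of solving Df * delta = -f
        x = x + delta
        if abs(delta) < tol:   # stop when the increment is tiny
            break
    return x

# Example: solve x^2 - 2 = 0, i.e. compute sqrt(2), starting from x0 = 1
root = newton_scalar(lambda x: x**2 - 2, lambda x: 2*x, 1.0)
```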

2.2. Newton’s method iteration formula for systems

For vector-valued functions, we will see shortly that an analogous result is true:

\[ DF(x^{(k)})\, \delta^{(k)} = -F(x^{(k)}), \quad x^{(k+1)} = x^{(k)} + \delta^{(k)} \]

where \(DF(x)\) is the \(n \times n\) matrix of all the partial derivatives \((D_{x_j} F_i)(x)\) or \((D_j F_i)(x)\), where \(x = (x_1, x_2, \dots, x_n).\)
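Concretely (this helper and the 2×2 example system are my own illustration, not part of the notes), each step solves a linear system for \(\delta^{(k)}\) rather than forming any inverse:

```python
import numpy as np

def newton_system(F, DF, x0, tol=1e-12, max_iter=50):
    """Multivariate Newton's method: at each step solve the linear system
    DF(x_k) delta = -F(x_k), then update x_{k+1} = x_k + delta.
    (A sketch, not code from these notes.)"""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        delta = np.linalg.solve(DF(x), -F(x))  # solve, don't invert
        x = x + delta
        if np.linalg.norm(delta) < tol:
            break
    return x

# Example: intersection of the circle x^2 + y^2 = 1 with the parabola y = x^2
F = lambda v: np.array([v[0]**2 + v[1]**2 - 1.0, v[1] - v[0]**2])
DF = lambda v: np.array([[2.0*v[0], 2.0*v[1]], [-2.0*v[0], 1.0]])
sol = newton_system(F, DF, [1.0, 1.0])
```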

2.2.1. Justification: linearization for functions of several variables

To justify the above result, we need at least a case of Taylor’s Theorem for functions of several variables, for both \(f:\mathbb{R}^n \to \mathbb{R}\) and \(F:\mathbb{R}^n \to \mathbb{R}^n\); just for linear approximations. This material from multi-variable calculus will be reviewed when we need it.
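In brief, anticipating that review, the key fact is the first-order Taylor approximation

\[ F(x^{(k)} + \delta) \approx F(x^{(k)}) + DF(x^{(k)})\, \delta; \]

choosing \(\delta = \delta^{(k)}\) so that this linear approximation equals zero gives exactly the system \(DF(x^{(k)})\, \delta^{(k)} = -F(x^{(k)})\).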

Warning: although mathematically this can be written with matrix inverses as

\[ x^{(k+1)} = x^{(k)} - \left( DF(x^{(k)}) \right)^{-1} F(x^{(k)}) \]

evaluation of the inverse is in general about three times slower than solving the linear system, so it is best avoided. (We will soon see a “compromise”: the LU factorization of a matrix.)
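To make the warning concrete, here is a small NumPy comparison (an illustrative snippet of mine, not from the notes); both compute the same solution, but `np.linalg.solve` works via a factorization instead of forming \(A^{-1}\):

```python
import numpy as np

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])

x_solve = np.linalg.solve(A, b)   # solve A x = b directly (preferred)
x_inv = np.linalg.inv(A) @ b      # form the inverse first (slower in general)
```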

Even avoiding matrix inversion, this involves repeatedly solving systems of \(n\) simultaneous linear equations in \(n\) unknowns, \(A x = b\), where the matrix \(A\) is \(D\textbf{F}(x^{(k)})\), and that will be seen to involve about \(n^3/3\) arithmetic operations.

It also requires computing the new values of these \(n^2\) partial derivatives at each iteration, also potentially with a cost proportional to \(n^3\).

When \(n\) is large, as is common with differential equations problems, this factor of \(n^3\) indicates a potentially very large cost per iteration, so various modifications have been developed to reduce the computational cost of each iteration (with the trade-off being that more iterations are typically needed): so-called quasi-Newton methods.
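One of the simplest such modifications is the “chord” method, which freezes the Jacobian at the initial guess; the sketch below (my own illustration, including the example system, assuming a reasonable starting point) shows the idea. In a serious implementation one would also LU-factor the frozen matrix once and reuse the factors, so each later iteration costs only about \(n^2\) operations:

```python
import numpy as np

def chord_method(F, J0, x0, tol=1e-10, max_iter=200):
    """Quasi-Newton 'chord' iteration (a hypothetical sketch): the Jacobian
    matrix J0 is fixed, so the n^2 partial derivatives are evaluated only
    once.  Convergence is typically linear rather than quadratic, so more
    iterations are usually needed than for full Newton."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        delta = np.linalg.solve(J0, -F(x))  # same frozen matrix every step
        x = x + delta
        if np.linalg.norm(delta) < tol:
            break
    return x

# Illustrative 2x2 system: the circle x^2 + y^2 = 1 meets the parabola y = x^2
F = lambda v: np.array([v[0]**2 + v[1]**2 - 1.0, v[1] - v[0]**2])
DF = lambda v: np.array([[2.0*v[0], 2.0*v[1]], [-2.0*v[0], 1.0]])
x0 = np.array([1.0, 1.0])
sol = chord_method(F, DF(x0), x0)
```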


This work is licensed under Creative Commons Attribution-ShareAlike 4.0 International