27. Simultaneous Linear Equations, Part 6: Iterative Methods

Revised on Wednesday April 14, 2021 after class; just making a few corrections, like fixing typos.

References:

  • Section 2.5 Iterative Methods in Sauer, Sub-sections 2.5.1 to 2.5.3.

  • Chapter 7 Iterative Techniques in Linear Algebra in Burden & Faires, Sections 7.1 to 7.3.

  • Section 8.4 in Cheney & Kincaid.

27.1. Introduction

This topic is a huge area, with lots of ongoing research; this section just explores the first few methods in the field:

  1. The Jacobi Method.

  2. The Gauss-Seidel Method.

The next three major topics for further study are:

  1. The Method of Successive Over-Relaxation (“SOR”). This is usually done as a modification of the Gauss-Seidel method, though the strategy of “over-relaxation” can also be applied to other iterative methods such as the Jacobi method.

  2. The Conjugate Gradient Method (“CG”). This is beyond the scope of this course; I mention it because in the realm of solving linear systems that arise in the solution of differential equations, CG and SOR are the basis of many of the most modern, advanced methods.

  3. Preconditioning.

27.2. The Jacobi Method

The basis of the Jacobi method for solving \(Ax = b\) is splitting \(A\) as \(D + R\) where \(D\) is the diagonal of \(A\):

\[\begin{split}\begin{split} d_{i,i} &= a_{i,i} \\ d_{i,j} &= 0, \quad i \neq j \end{split}\end{split}\]

so that \(R = A-D\) has

\[\begin{split}\begin{split} r_{i,i} &= 0 \\ r_{i,j} &= a_{i, j}, \quad i \neq j \end{split}\end{split}\]

Visually

\[\begin{split} D = \left[ \begin{array}{cccc} a_{11} & 0 & 0 & \dots \\ 0 & a_{22} & 0 & \dots \\ 0 & 0 & a_{33} & \dots \\ \vdots & \vdots & \vdots & \ddots \end{array} \right] \end{split}\]

It is easy to solve \(Dx = b\): the equations are just \(a_{i,i} x_i = b_i\), with solution \(x_i = b_i/a_{i,i}\).

Thus we rewrite the equation \(Ax = Dx + Rx = b\) in the fixed point form

\[Dx = b - Rx\]

and then use the familiar fixed point iteration strategy of inserting the current approximation at right and solving for the new approximation at left:

\[D x^{(k)} = b - R x^{(k-1)}\]

Note: We could make this look closer to the standard fixed-point iteration form \(x_k = g(x_{k-1})\) by dividing out \(D\) to get

\[x^{(k)} = D^{-1}(b - R x^{(k-1)}),\]

but — as is often the case — it will be better to avoid matrix inverses by instead solving this easy system. This “inverse avoidance” becomes far more important when we get to the Gauss-Seidel method!
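To make this concrete, here is a minimal sketch in Python/NumPy of a single Jacobi update in the form above; the function name jacobi_step and the explicit construction of \(D\) and \(R\) are my own choices for illustration, not part of the exercise below.

```python
import numpy as np

def jacobi_step(A, b, x_old):
    """One Jacobi update: solve D x_new = b - R x_old, where D = diag(A), R = A - D.
    A minimal sketch for illustration; no error checking."""
    diagonal = np.diag(A)            # the entries a_{i,i}
    R = A - np.diag(diagonal)        # off-diagonal remainder
    # Solving D x_new = b - R x_old just divides each component by a_{i,i}:
    return (b - R @ x_old) / diagonal
```

Iterating this update, starting for example from \(x^{(0)} = 0\), gives the basic method asked for in the exercise below.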

27.2.1. Exercise 1: Implement and test the Jacobi method

Write and test Python functions for this.

A) As usual, start with a most basic version that does a fixed number of iterations:

x = jacobi_basic(A, b, n)

B) Then refine this to apply an error tolerance, while also avoiding infinite loops by imposing an upper limit on the number of iterations:

x = jacobi(A, b, errorTolerance, maxIterations)

Test this with the matrices of form \(T\) below for several values of \(n\), increasing geometrically. To be cautious initially, try \(n=2, 4, 8, 16, \dots\)
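As a sketch of one way to organize part (B), the stopping test can be separated from the update itself; the names iterate_until_converged and update below are my own, hypothetical choices, and the sketch assumes the update takes and returns a NumPy vector.

```python
import numpy as np

def iterate_until_converged(update, x0, errorTolerance=1e-8, maxIterations=100):
    """Generic fixed-point loop: repeat x <- update(x) until successive iterates
    agree to within errorTolerance (in the maximum norm), but never exceed
    maxIterations, guarding against an infinite loop."""
    x = x0
    for _ in range(maxIterations):
        x_new = update(x)
        if np.linalg.norm(x_new - x, np.inf) <= errorTolerance:
            return x_new
        x = x_new
    return x  # reached maxIterations without meeting the tolerance
```

For example, jacobi(A, b, errorTolerance, maxIterations) could wrap a loop like this around the jacobi_step sketch above, starting from the zero vector.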

27.3. The underlying strategy

To analyse the Jacobi method — answering questions like for which matrices it works, and how quickly it converges — and also to improve on it, it helps to describe a key strategy underlying it, which is this: approximate the matrix \(A\) by another matrix \(E\) that is easier to solve with, chosen so that the discrepancy \(R = A-E\) is small enough. Thus, repeatedly solving the new, easier equations \(Ex^{(k)} = b^{(k)}\) (with an updated right-hand side each time) plays a similar role to repeatedly solving tangent line approximations in Newton’s method.

Of course to be of any use, \(E\) must be somewhat close to \(A\); the remainder \(R\) must be small enough. We can make this requirement precise with the use of matrix norms and an upgrade of the contraction mapping theorem seen in the section on fixed-point iteration.

Thus consider a general splitting of \(A\) as \(A = E + R\). As above, we rewrite \(Ax = Ex + Rx = b\) as \(Ex = b - Rx\) and thence as \(x = E^{-1}b - (E^{-1}R)x\). (It is alright to use the matrix inverse here, since we are not actually computing it; only using it for a theoretical argument!) The fixed point iteration form is thus

\[x^{(k)} = g(x^{(k-1)}) = c - S x^{(k-1)}\]

where \(c = E^{-1}b\) and \(S = E^{-1}R\).

For vector-valued functions we extend the previous definition as:

Definition 1: Vector-valued Contraction Mapping. For a set \(D\) of vectors in \(\mathbb{R}^n\), a mapping \(g:D \to D\) is called a contraction or contraction mapping if there is a constant \(C < 1\) such that

\[\|g(x) - g(y)\| \leq C \|x - y\|\]

for any \(x\) and \(y\) in \(D\). We then call \(C\) a contraction constant.

Next, also extend the Contraction Mapping Theorem to

Theorem 1. (Contraction Mapping Theorem for vector-valued functions).

  • Any contraction mapping \(g\) on a closed, bounded set \(D \subset \mathbb{R}^n\) has exactly one fixed point \(p\) in \(D\).

  • This can be calculated as the limit \(\displaystyle p = \lim_{k \to \infty} x^{(k)}\) of the iteration sequence given by \(x^{(k)} = g(x^{(k-1)})\) for any choice of the starting point \(x^{(0)} \in D\).

  • The errors decrease at a guaranteed minimum speed: \(\| x^{(k)} - p \| \leq C \| x^{(k-1)} - p \|\), so \(\| x^{(k)} - p \| \leq C^k \| x^{(0)} - p \|\).

With this, it turns out that the above iteration converges if \(S\) is “small enough” in the sense that \(\|S\| = C < 1\) — and it is enough that this holds for any one choice of matrix norm!

Theorem 2. If \(S := E^{-1}R = E^{-1}A - I\) has \(\|S\| = C < 1\) for some choice of matrix norm, then the iterative scheme \(x^{(k)} = c - S x^{(k-1)}\) with \(c = E^{-1}b\) converges to the solution of \(Ax = b\) for any choice of the initial approximation \(x^{(0)}\). (Aside: the zero vector is an obvious and popular choice for \(x^{(0)}\).)

Incidentally, since this condition guarantees that there exists a unique solution to \(Ax=b\), it also shows that \(A\) is non-singular.

Main Proof Idea: For \(g(x) = c - S x\),

\[\|g(x) - g(y)\| = \| (c - S x) - (c - S y) \| = \| S(y - x) \| \leq \|S\| \|y-x\| \leq C \|x - y\|,\]

so with \(C < 1\), it is a contraction.

(The omitted, more “technical” detail is to find a suitable closed, bounded domain \(D\) that all the iterates \(x^{(k)}\) stay inside.)
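As a quick numerical companion to Theorem 2, the contraction constant for a given splitting can be estimated without forming any inverse, by solving \(E S' = A\) for \(S' = E^{-1}A\) and subtracting \(I\); this sketch assumes NumPy’s np.linalg.solve is acceptable here, and the function name contraction_constant is my own.

```python
import numpy as np

def contraction_constant(A, E):
    """Return C = ||E^{-1} A - I|| in the infinity norm, for the splitting A = E + R.
    Convergence of the iteration is guaranteed when C < 1 (Theorem 2)."""
    S = np.linalg.solve(E, A) - np.eye(len(A))   # S = E^{-1} A - I, without forming E^{-1}
    return np.linalg.norm(S, np.inf)
```

For the Jacobi splitting, E would be np.diag(np.diag(A)); a value below 1 guarantees convergence, while a value at or above 1 is inconclusive.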

27.3.1. What does this say about the Jacobi method?

For the Jacobi method, \(E = D\) so \(E^{-1}\) is the diagonal matrix with elements \(1/a_{i,i}\) on the main diagonal, zero elsewhere. The product \(E^{-1}A\) then multiplies each row \(i\) of \(A\) by \(1/a_{i,i}\), giving

\[\begin{split} E^{-1}A = \left[ \begin{array}{cccc} 1 & a_{1,2}/a_{1,1} & a_{1,3}/a_{1,1} & \dots \\ a_{2,1}/a_{2,2} & 1 & a_{2,3}/a_{2,2} & \dots \\ a_{3,1}/a_{3,3} & a_{3,2}/a_{3,3} & 1 & \dots \\ \vdots & \vdots & \vdots & \ddots \end{array} \right] \end{split}\]

so that subtracting the identity matrix to get \(S\) cancels the ones on the main diagonal:

\[\begin{split} S = E^{-1}A - I= \left[ \begin{array}{cccc} 0 & a_{1,2}/a_{1,1} & a_{1,3}/a_{1,1} & \dots \\ a_{2,1}/a_{2,2} & 0 & a_{2,3}/a_{2,2} & \dots \\ a_{3,1}/a_{3,3} & a_{3,2}/a_{3,3} & 0 & \dots \\ \vdots & \vdots & \vdots & \ddots \end{array} \right] \end{split}\]

Here is one of many places where using the maximum norm, a.k.a. the \(\infty\)-norm, makes life much easier! Recall that it is given by

\[ \| A \|_\infty = \max_{i=1}^n \left( \sum_{j=1}^n |a_{i,j}| \right), \]
  • First, sum the absolute values of the elements in each row \(i\) of \(S\); extracting the common factor \(1/|a_{i,i}|\), this gives \( \left( |a_{i,1}| + |a_{i,2}| + \cdots + |a_{i,i-1}| + |a_{i,i+1}| + \cdots + |a_{i,n}| \right)/|a_{i,i}| \).
    Such a sum, skipping index \(j=i\), can be abbreviated as

\[\left( \sum_{1 \leq j \leq n, j \neq i}|a_{i,j}| \right) \Big/ |a_{i,i}|\]
  • Then get the norm as the maximum of these:

\[ C = \| S \|_\infty = \| E^{-1}A - I \|_\infty = \max_{i=1}^n \left[ \left( \sum_{1 \leq j \leq n, j \neq i}|a_{i,j}| \right) \Big/ |a_{i,i}| \right] \]

and the contraction condition \(C < 1\) becomes the requirement that each of these \(n\) “row sums” is less than 1.

Multiplying each of these inequalities by its denominator \(|a_{i,i}|\) gives the \(n\) conditions

\[\left( \sum_{1 \leq j \leq n, j \neq i}|a_{i,j}| \right) < |a_{i,i}|\]

In words, the magnitude of each main diagonal element is greater than the sum of the magnitudes of all other elements in that row. Intuitively, the elements of the main diagonal dominate the rest of their row in size; hence the name for such a matrix:

Definition 2: (Row-wise) Strictly Diagonally Dominant. A matrix \(A\) is Row-wise Strictly Diagonally Dominant (sometimes abbreviated as just Strictly Diagonally Dominant or even just SDD) if

\[\sum_{1 \leq j \leq n, j \neq i}|a_{i,j}| < |a_{i,i}|, \quad \text{for each } i, \; 1 \leq i \leq n.\]
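For reference, here is a small sketch of how one might check this condition numerically with NumPy; the function name is_strictly_diagonally_dominant is my own choice.

```python
import numpy as np

def is_strictly_diagonally_dominant(A):
    """Check row-wise strict diagonal dominance: for every row i,
    |a_{i,i}| exceeds the sum of |a_{i,j}| over all j != i."""
    diagonal = np.abs(np.diag(A))
    off_diagonal_sums = np.sum(np.abs(A), axis=1) - diagonal
    return np.all(off_diagonal_sums < diagonal)
```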

Another way to think of this is that such a matrix \(A\) is close to its main diagonal \(D\), which is the intuitive condition that the approximation of \(A\) by \(D\) as done in the Jacobi method is “good enough”. And indeed, combining this result with Theorem 2 gives:

Theorem 3. The Jacobi Method converges if \(A\) is Strictly Diagonally Dominant, for any initial approximation \(x^{(0)}\).

Further, the error goes down by at least a factor of \(\| I - D^{-1}A \|_\infty\) at each iteration.

By the way, other matrix norms give other conditions guaranteeing convergence; perhaps the most useful of these others is that it is also sufficient for \(A\) to be Column-wise Strictly Diagonally Dominant — I leave it to you to guess the definition of that!

Aside: This theorem also shows that any strictly diagonally dominant matrix is non-singular; this can be a useful way to verify in advance that a system of equations has a unique solution, even if you plan to compute it by some other method like row reduction.

Further, it can be shown (as will be done in a proposed later section of this book!) that pivoting (row swapping) is not needed: the “dominance” of elements on the main diagonal persists, so that they are always good choices for the pivot element.

27.4. The Gauss-Seidel Method

To recap, two key ingredients for a splitting \(A = E + R\) to be useful are that

  • the matrix \(E\) is “easy” to solve with, and

  • it is not too far from \(A\).

The Jacobi method choice of \(E\) being the main diagonal of \(A\) strongly emphasizes the “easy” part, but we have seen another, larger class of matrices for which it is fairly quick and easy to solve \(Ex = b\): triangular matrices, which can be solved with forward or backward substitution, not needing row reduction.

The Gauss-Seidel Method takes \(E\) to be the lower triangular part of \(A\), which intuitively keeps \(E\) closer to \(A\) and makes the remainder \(R = A-E\) “smaller”.

To discuss this and other splittings, we write the matrix as \(A = L + D + U\) where:

  • \(D\) is the diagonal of \(A\), as for Jacobi

  • \(L\) is the strictly lower triangular part of \(A\) (just the elements with \(i > j\))

  • \(U\) is the strictly upper triangular part of \(A\) (just the elements with \(i < j\))

That is,

\[\begin{split} A = \left[ \begin{array}{cccc} a_{1,1} & a_{1,2} & a_{1,3} & \dots \\ a_{2,1} & a_{2,2} & a_{2,3} & \dots \\ a_{3,1} & a_{3,2} & a_{3,3} & \dots \\ \vdots & \vdots & \vdots & \ddots \end{array} \right] = \left[ \begin{array}{cccc} 0 & 0 & 0 & \dots \\ a_{2,1} & 0 & 0 & \dots \\ a_{3,1} & a_{3,2} & 0 & \dots \\ \vdots & \vdots & \vdots & \ddots \end{array} \right] + \left[ \begin{array}{cccc} a_{1,1} & 0 & 0 & \dots \\ 0 & a_{2,2} & 0 & \dots \\ 0 & 0 & a_{3,3} & \dots \\ \vdots & \vdots & \vdots & \ddots \end{array} \right] + \left[ \begin{array}{cccc} 0 & a_{1,2} & a_{1,3} & \dots \\ 0 & 0 & a_{2,3} & \dots \\ 0 & 0 & 0 & \dots \\ \vdots & \vdots & \vdots & \ddots \end{array} \right] = L + D + U \end{split}\]

Thus \(R = L+U\) for the Jacobi method.

So now we use \(E = L + D\), which will be called \(A_L\), the lower triangular part of \(A\), and the remainder is \(R = U\). The fixed point form becomes

\[A_L x = b - Ux\]

giving the fixed point iteration

\[A_L x^{(k)} = b - U x^{(k-1)}\]

Here we definitely do not use the inverse of \(A_L\) when calculating! Instead, solve with forward substitution.
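Here is a minimal sketch of one such update, assuming NumPy float arrays; the forward substitution happens implicitly by updating the components of \(x\) in increasing order, and the function name gauss_seidel_step is my own choice for illustration.

```python
import numpy as np

def gauss_seidel_step(A, b, x_old):
    """One Gauss-Seidel update: solve (L + D) x_new = b - U x_old by forward substitution.
    Updating the entries of x in increasing order of i uses new values as soon as they
    are available, which is exactly forward substitution on A_L = L + D."""
    x = np.array(x_old, dtype=float)   # copy, ensuring floating-point arithmetic
    for i in range(len(b)):
        # sum of a_{i,j} x_j over j != i, using already-updated entries for j < i:
        off_diagonal = A[i, :i] @ x[:i] + A[i, i+1:] @ x[i+1:]
        x[i] = (b[i] - off_diagonal) / A[i, i]
    return x
```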

However to analyse convergence, the mathematical form

\[x^{(k)} = A_L^{-1}b - (A_L^{-1}U) x^{(k-1)}\]

is useful: the iteration map is now \(g(x) = c - S x\) with \(c = (L + D)^{-1} b\) and \(S = (L + D)^{-1} U\).

Arguing as above, we see that convergence is guaranteed if \(\| (L + D)^{-1} U\| < 1\). However, it is not so easy in general to get a formula for \(\| (L + D)^{-1} U\|\); what one can get is slightly disappointing in that, despite the \(R=U\) here being in some sense “smaller” than the \(R=L+U\) for the Jacobi method, the general convergence guarantee looks no better:

Theorem 4. The Gauss-Seidel method converges if \(A\) is Strictly Diagonally Dominant, for any initial approximation \(x^{(0)}\).

However, in practice the convergence rate as given by \(C = C_{GS} = \| (L + D)^{-1} U\|\) is often better than \(C = C_J = \| D^{-1} (L+U) \|\) for the Jacobi method.

Sometimes this reduces the number of iterations enough to outweigh the extra computational effort involved in each iteration and make this faster overall than the Jacobi method — but not always.
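As an illustration of that comparison, the two contraction constants can be computed in the \(\infty\)-norm for a small strictly diagonally dominant matrix; the example matrix below is my own choice, not one from the text.

```python
import numpy as np

A = np.array([[ 4.0, -1.0,  0.0],
              [-1.0,  4.0, -1.0],
              [ 0.0, -1.0,  4.0]])   # a small SDD example (my choice)
D = np.diag(np.diag(A))
L = np.tril(A, k=-1)                 # strictly lower triangular part
U = np.triu(A, k=1)                  # strictly upper triangular part

C_J  = np.linalg.norm(np.linalg.solve(D, L + U), np.inf)   # Jacobi constant
C_GS = np.linalg.norm(np.linalg.solve(L + D, U), np.inf)   # Gauss-Seidel constant
print(f"C_J = {C_J}, C_GS = {C_GS}")   # here C_GS comes out smaller than C_J
```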

27.4.1. Exercise 2: Implement and test the Gauss-Seidel method, and compare to Jacobi

Do the two versions as above and use the same test cases.

Then compare the speed/cost of the two methods: one way to do this is by using Python’s “stop watch” function time.time; see the description of the Python module time in the Python manual.
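Here is a sketch of that timing pattern; np.linalg.solve is used only as a stand-in solver so the snippet runs on its own, and the helper name time_solver is my own. The jacobi and gauss_seidel functions from the exercises could be timed and compared in exactly the same way.

```python
import time
import numpy as np

def time_solver(solver, A, b, **kwargs):
    """Return (solution, elapsed seconds) for a call solver(A, b, **kwargs)."""
    start = time.time()
    x = solver(A, b, **kwargs)
    return x, time.time() - start

# Stand-in usage with a solver that certainly exists:
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x, seconds = time_solver(np.linalg.solve, A, b)
print(f"Elapsed time: {seconds:.2e} seconds")
```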

27.5. A family of test cases, arising from boundary value problems for differential equations

The tri-diagonal matrices \(T\) of the form

\[\begin{split}\begin{split} t_{i,i} &= 1 + 2 h^2 \\ t_{i,i+1} = t_{i+1,i} &= -h^2 \\ t_{i,j} &= 0, \quad |i-j|> 1 \end{split}\end{split}\]

and variants of this arise in the solutions of boundary value problems for ODEs like

\[\begin{split}\begin{split} -u''(x) + K u &= f(x), \quad a \leq x \leq b \\ u(a) = u(b) &= 0 \end{split}\end{split}\]

and related problems for partial differential equations.

Thus these provide useful initial test cases — usually with \(h = (b-a)/n\).
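Here is a sketch of one way to build these test matrices with NumPy; the function name tridiagonal_T and the choice of interval endpoints in the example are my own.

```python
import numpy as np

def tridiagonal_T(n, h):
    """Build the n-by-n tridiagonal matrix T with 1 + 2h^2 on the main diagonal
    and -h^2 on the first sub- and super-diagonals."""
    T = np.diag(np.full(n, 1 + 2*h**2))
    T += np.diag(np.full(n - 1, -h**2), k=1)    # super-diagonal
    T += np.diag(np.full(n - 1, -h**2), k=-1)   # sub-diagonal
    return T

# For example, on an interval [a, b] = [0, 1] (an assumed choice) with n = 4:
n = 4
h = (1 - 0) / n
print(tridiagonal_T(n, h))
```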


This work is licensed under Creative Commons Attribution-ShareAlike 4.0 International