12. Simultaneous Linear Equations, Part 4: Solving \(Ax = b\) With Both Pivoting and LU Factorization

Updated on Wednesday February 17, with pseudo-code, Python code, and an example.

References:

  • Section 2.4 The PA=LU Factorization of Sauer

  • Section 6.5 Matrix Factorizations of Burden & Faires, from Permutation Matrices onward.

  • Section 8.1 Matrix Factorizations of Cheney & Kincaid

12.1. Introduction

The last step in producing an algorithm for solving the general case of \(n\) simultaneous linear equations in \(n\) variables that is robust and efficient, with good control of rounding error, is to combine the ideas of partial pivoting (reordering the equations to avoid small pivot elements) and the LU factorization.

This is sometimes described in three parts:

  • permute (reorder) the rows of the matrix \(A\) by multiplying it on the left by a suitable permutation matrix \(P\); one with a single “1” in each row and each column and zeros elsewhere;

  • Get the LU factorization of this matrix: \(PA = LU\).

  • To solve \(A x = b\)

    • Express as \(P A x = L U x = P b\) (which only requires computing \(Pb\), a reordering of the elements of \(b\))

    • Solve \(L c = P b\) for \(c\) by forward substitution

    • Solve \(U x = c\) for \(x\) by backward substitution: as before, this gives \(L U x = L c = P b\) and \(L U x = P A x\), so \(PAx = Pb\); since a permutation matrix \(P\) is invertible (just unravel the row swaps), this ensures that \(Ax = b\).

This gives a nice formulation in terms of matrices; however, we can describe it a bit more compactly and efficiently by just talking about the permutation of the rows, described by a permutation vector — an \(n\)-component vector \(\pi = [\pi_1, \pi_2 , \dots, \pi_n]\) whose elements are the integers from 1 to \(n\) in some order. That is how the algorithm will be described below.

(Aside: I use the conventional name \(\pi\) for a permutation vector, partly to distinguish it from the notation \(p_i\) used for pivot rows; however, feel free to use the name \(p\) instead, especially in Python code!)

A number of details of this sketch will now be filled in, including the very useful fact that the permutation vector (or matrix) can be constructed “on the fly”, as rows are swapped in partial pivoting.
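
(Another aside: this combined strategy is also what library routines provide ready-made; for example, SciPy's scipy.linalg.lu_factor and scipy.linalg.lu_solve package the factor-once, substitute-twice pattern. Here is a minimal sketch, using the same example matrix and right-hand side that appear later in this section, which can serve as a cross-check on the hand-built version developed below.)

from numpy import array
from scipy.linalg import lu_factor, lu_solve

A = array([[1., -3., 22.], [3., 5., -6.], [4., 235., 7.]])
b = array([2., 3., 4.])
(lu, piv) = lu_factor(A)      # one factorization, with partial pivoting recorded in piv
x = lu_solve((lu, piv), b)    # then just the forward and backward substitutions
print(f"x = {x}")
print(f"residual b - Ax = {b - A @ x}")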

12.2. Row swapping is all you need

Let us look at maximal element partial pivoting, but described in terms of the entries of the factors \(L\) and \(U\), and updating matrix \(A\) with a succession of row swaps.

(For now, I omit what happens to the right-hand side vector \(b\); that is where the permutation vector \(\pi\) will come in, as addressed below.)

What happens if pivoting occurs at some stage \(k\), with swapping of row \(k\) with a row \(p_k > k\)?

One might fear that the process has to start again from the top using the modified version of matrix \(A\), but in fact all previous work can be reused, just swapping those rows “everywhere”.

12.2.1. Example: what happens at stage 5 (\(k=5\))?

To see this with a concrete example consider what happens if at stage \(k=5\) we swap rows 5 and 10 of \(A\).

A) Firstly, what happens to matrix \(A\)?

The previous steps of the LU factorization process only involved entries of \(A\) in its first four rows and first four columns, and this row swap has no effect on them. Likewise, in row reduction, changes at and below row \(k=5\) have no effect on the first four rows of the row-reduced form, \(U\).

Thus, the only change here is to swap the entries of \(A\) between rows 5 and 10. What is more, the subsequent calculations only involve columns of index \(j=5\) upwards, so in fact we only need to update those entries. This can be written as

\[ a_{5, j} \leftrightarrow a_{10, j}, \quad 5 \leq j \leq n \]

Thus if we are working in Python with \(A\) stored in a numpy array, the update can be done with the slice operation

A[[5, 10], 5:] = A[[10, 5], 5:]

(The lists of row indices make numpy copy the right-hand side before assigning, so the two rows really are exchanged; a tuple assignment that swaps two overlapping slice views would instead overwrite one row with the other. And beware that pesky Pythonic down-shifting of indices; to be seen in pseudo-code later!)
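
As a quick check of this idiom, here is a small, arbitrary 3×3 example (with the 0-based Python indices this time), swapping rows 1 and 2 from column 1 onward:

from numpy import array
M = array([[1., 2., 3.], [4., 5., 6.], [7., 8., 9.]])
M[[1, 2], 1:] = M[[2, 1], 1:]   # rows 1 and 2 exchange their entries in columns 1 and 2
print(M)                        # column 0 is untouched: rows 1 and 2 become [4. 8. 9.] and [7. 5. 6.]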

B) Next, look at the work done so far on \(U\).

That just consists of the previous rows \(1 \leq i \leq 4\), and the swapping of rows 5 with 10 has no effect up there:

Values already computed in \(U\) are unchanged.

C) Finally, look at the work done so far on the multipliers \(l_{i,j}\); that is, matrix \(L\).

The values computed so far are the first four columns of \(L\); the multiples \(l_{i,j}\), \(1 \leq j \leq 4\) of row \(j\) subtracted from row \(i > j\). These do change: for example, the multiple \(l_{5,2}\) of row \(2\) is now subtracted from what was row 5 but is now row 10: thus, the new value of \(l_{10,2}\) is the previous value of \(l_{5,2}\).

Likewise, the same is true in reverse: the new value of \(l_{5,2}\) is the previous value of \(l_{10,2}\). This applies for each of the first four pivot rows, that is, for second index \(1 \leq j \leq 4\):

The entries of \(L\) computed so far are swapped between rows 5 and 10, leaving the rest unchanged.

As this is again only for some columns — the first four — the swaps needed are:

\[ l_{5, j} \leftrightarrow l_{10, j}, \quad 1 \leq j \leq 4 \]

or in Python slice notation for an array \(L\):

L[[5, 10], :4] = L[[10, 5], :4]

12.2.2. The general pattern

The example above extends to all stages \(k\) of row reduction or of computing the LU factorization of a permuted version of matrix \(A\): we adjust the pivot element at position \((k, k)\) by first swapping row \(k\) with a row \(p_k \geq k\). (Allowing that sometimes no swap is needed, so that \(p_k = k\).)

Gathering the key formulas above, this part of the algorithm is

for k from 1 to n-1:
\(\quad\) Find the pivot row \(p_k \geq k\).
\(\quad\) if \(p_k > k\):
\(\quad\quad\) Swap \(l_{k, j} \leftrightarrow l_{p_k, j}, \quad 1 \leq j < k \)
\(\quad\quad\) Swap \(a_{k, j} \leftrightarrow a_{p_k, j}, \quad k \leq j \leq n \)
\(\quad\) end if
end for

12.2.3. Pseudo-code for LU factorization with row swapping (first version)

Here I also adopt slice notation; for example, \(a_{k, k:n}\) denotes the slice \([a_{k, k} \dots a_{k, n}]\). In the pivot search, \(|u_{i, k}|\) for \(i \geq k\) refers to the candidate pivot value \(a_{i,k}-\sum_{s=1}^{k-1}l_{i,s}u_{s,k}\): the entry that would appear in position \((i, k)\) after eliminating the first \(k-1\) columns.

for k from 1 to n
\(\quad\) Find the pivot element:
\(\quad\) \(p = k\) \(\quad\) (p will be the index of the pivot row)
\(\quad\) for i from k+1 to n:
\(\quad\)\(\quad\) if \(|u_{i, k}| > |u_{p, k}|\):
\(\quad\)\(\quad\)\(\quad\) \(p = i\)
\(\quad\)\(\quad\) end if
\(\quad\) end for
\(\quad\) if p > k: \(\quad\) (Swap rows)
\(\quad\)\(\quad\) \(l_{k, 1:k-1} \leftrightarrow l_{p, 1:k-1}\)
\(\quad\)\(\quad\) \(a_{k, k:n} \leftrightarrow a_{p, k:n}\)
\(\quad\) end if
\(\quad\) for j from k to n: \(\quad\) (Get the non-zero elements in row \(k\) of \(U\))
\(\quad\quad\) \(u_{k,j}=a_{k,j}-\sum_{s=1}^{k-1}l_{k,s}u_{s,j}\)
\(\quad\) end for
\(\quad\) for i from k+1 to n \(\quad\) (Get the non-zero elements in column \(k\) of \(L\) — except the 1’s on its diagonal)
\(\quad\quad\) \(l_{i,k}=\displaystyle\frac{a_{i,k}-\sum_{s=1}^{k-1}l_{i,s}u_{s,k}}{u_{k,k}}\)
\(\quad\) end for
end for
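
Since the Python code later in this section implements the permutation-vector variant instead, here is a minimal sketch of this first version in Python, working on a copy of \(A\) and physically swapping rows; the function name luFactorizeWithRowSwapping is mine, and the pivot search compares the candidate values described above.

from numpy import zeros_like

def luFactorizeWithRowSwapping(A):
    """LU factorize a row-permuted version of A, swapping rows of a working copy
    of A (and of the partially built L) whenever pivoting asks for it.
    The permutation itself is not recorded in this first version."""
    A = A.copy()  # work on a copy, so the caller's matrix is left alone
    n = len(A)
    L = zeros_like(A)
    U = zeros_like(A)
    for k in range(n):
        # Find the pivot row: the candidate value of largest magnitude in column k
        p = k
        for i in range(k+1, n):
            if abs(A[i,k] - L[i,:k] @ U[:k,k]) > abs(A[p,k] - L[p,:k] @ U[:k,k]):
                p = i
        if p > k:
            # Swap rows k and p: columns k onward of A, columns before k of L
            A[[k, p], k:] = A[[p, k], k:]
            L[[k, p], :k] = L[[p, k], :k]
        # The non-zero elements in row k of U ...
        U[k,k:] = A[k,k:] - L[k,:k] @ U[:k,k:]
        # ... and in column k of L, including the 1 on the main diagonal
        L[k,k] = 1.
        for i in range(k+1, n):
            L[i,k] = (A[i,k] - L[i,:k] @ U[:k,k]) / U[k,k]
    return (L, U)

On the example matrix used below, this returns the same \(U\) as the function plu developed later, and an \(L\) that is genuinely unit lower triangular, because it factorizes the row-permuted matrix rather than \(A\) itself. Without a record of the permutation, though, it cannot yet be used to solve \(Ax = b\); that is what comes next.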

12.2.4. But what about the right-hand side, \(b\)?

One thing is missing from this strategy so far: if we are solving with a given right-hand-side column vector \(b\), we would also swap its rows at each stage, with

\[ b_k \leftrightarrow b_{p_k} \]

but with the LU factorization we need to keep track of these swaps for use later.

This turns out to mesh nicely with another detail: we can avoid actually copying array entries around by just keeping track of the order in which we use rows to get zeros in other rows. Our goal will be a permutation vector \(\pi = [\pi_1, \pi_2, \dots \pi_n]\) which says:

  • First use row \(\pi_1\) to get zeros in column 1 of the \(n-1\) other rows.

  • Then use row \(\pi_2\) to get zeros in column 2 of the \(n-2\) remaining rows.

To do this:

  • first, initialize an array \(\pi = [1, 2, \dots, n]\)

  • at stage \(k\), if the pivot element is in row \(p_k \neq k\), swap the corresponding elements in \(\pi\) (rather than swapping entire rows of arrays):

\[\pi_k \leftrightarrow \pi_{p_k}\]

Introducing the name \(A'\) for the new version of matrix \(A\), its row \(k\) has entries \(a'_{k, j} = a_{\pi_k, j}\).

This pattern persists through each row swap: instead of computing a succession of updated versions of matrix \(A\), we leave it alone and just change the row indices:

All references to entries of \(A\) are now done with permuted row index: \(a_{\pi_i, j}\)

The same applies to the array \(L\) of multipliers:

All references to entries of \(L\) are now done with \(l_{\pi_i, j}\).

Finally, since these row swaps also apply to the right-hand side \(b\), we do the same there:

All references to entries of \(b\) are now done with \(b_{\pi_i}\).
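
As a tiny illustration of this bookkeeping (arbitrary values, and 0-based Python indices as in the code below):

from numpy import array
perm = array(range(4))                    # initialize: perm = [0, 1, 2, 3]
(perm[0], perm[2]) = (perm[2], perm[0])   # at stage 1, suppose the pivot is in what was row 3
b = array([10., 20., 30., 40.])
print(perm)                               # [2 1 0 3]
print(b[perm])                            # b seen through the permutation: [30. 20. 10. 40.]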

12.2.5. Pseudo-code for LU factorization with a permutation vector

Initialize the permutation vector, \(\pi \leftarrow [1, 2, \dots, n]\)

for k from 1 to n
\(\quad\) Find the pivot element:
\(\quad\) \(p \leftarrow k\) \(\quad\) (p will be the index of the pivot row)
\(\quad\) for i from k+1 to n:
\(\quad\)\(\quad\) if \(|u_{i, k}| > |u_{p, k}|\):
\(\quad\)\(\quad\)\(\quad\) \(p \leftarrow i\)
\(\quad\)\(\quad\) end if
\(\quad\) end for
\(\quad\) if p > k: \(\quad\) (Just swap indices, not rows)
\(\quad\)\(\quad\) \(\pi_k \leftrightarrow \pi_p\)
\(\quad\) end if
\(\quad\) for j from k to n: \(\quad\) (Get the non-zero elements in row \(k\) of \(U\), referring to the rows of \(A\) and \(L\) via the permuted indices)
\(\quad\quad\) \(u_{k,j} \leftarrow a_{\pi_k,j}-\sum_{s=1}^{k-1}l_{\pi_k,s}u_{s,j}\)
\(\quad\) end for
\(\quad\) for i from k+1 to n: \(\quad\) (Get the non-zero elements in column \(k\) of \(L\) — except the 1’s on its diagonal)
\(\quad\quad\) \(l_{\pi_i,k} \leftarrow \displaystyle\frac{a_{\pi_i,k}-\sum_{s=1}^{k-1}l_{\pi_i,s}u_{s,k}}{u_{k,k}}\)
\(\quad\) end for
end for

Aside: For the version with a permutation matrix \(P\), instead:

  • start with an array \(P\) that is the identity matrix, and then

  • swap its rows \(k \leftrightarrow p_k\) at stage \(k\) instead of swapping the entries of \(\pi\) or the rows of \(A\) and \(L\).
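
For instance, as a small sketch of that aside, the matrix \(P\) corresponding to a permutation vector can also be produced after the fact by reordering the rows of the identity matrix (here using the permutation vector and matrix from the example below):

from numpy import array, identity
perm = array([2, 1, 0])        # the permutation vector that the example below produces
P = identity(3)[perm, :]       # rows of the identity matrix, in the order given by perm
A = array([[1., -3., 22.], [3., 5., -6.], [4., 235., 7.]])
print(P @ A)                   # the same as A[perm, :]: the rows of A, reordered

Returning to the permutation-vector version, here is Python code implementing the pseudo-code above: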

from numpy import array, zeros_like
def plu(A, demoMode=False):
    """Compute the Doolittle PA=LU factorization of A —
    but with the permutation recorded as permutation vector, not as the permutation matrix P.
    Sums like $\sum_{s=1}^{k-1} l_{k,s} u_{s,j}$ are done as matrix products;
    in the above case, row matrix L[k, 1:k-1] by column matrix U[1:k-1,j] gives the sum for a give j,
    and row matrix L[k, 1:k-1] by matrix U[1:k-1,k:n] gives the relevant row vector.
    
    NOT WORKING YET!
    """
    n = len(A)  # len() gives the number of rows in a 2D array.
    perm = array(range(n))
    # Initialize U as the zero matrix;
    # correct below the main diagonal, with the other entries to be computed below.
    U = zeros_like(A)
    # Also, initialize L as the zero matrix;
    # the 1's will also be filled in as we go.
    L = zeros_like(A)
    for k in range(n-1):
        if demoMode: print(f"{k=}")

        # Find the pivot element in column k, comparing the candidate values
        # u_{i,k} = A[perm[i],k] - L[perm[i],:k] @ U[:k,k] for rows i from k on:
        pivot_row = k
        abs_u_ik_max = abs(A[perm[k],k] - L[perm[k],:k] @ U[:k,k])
        for row in range(k+1, n):
            abs_u_ik = abs(A[perm[row],k] - L[perm[row],:k] @ U[:k,k])
            if abs_u_ik > abs_u_ik_max:
                pivot_row = row
                abs_u_ik_max = abs_u_ik
        if pivot_row > k: # "swap"
            if demoMode: print(f"Swap row {k} with row {pivot_row}")
            (perm[k], perm[pivot_row]) = (perm[pivot_row], perm[k])
        else:
            if demoMode: print(f"No row swap needed.")
        U[k,k:] = A[perm[k],k:] - L[perm[k],:k] @ U[:k,k:]
        L[perm[k],k] = 1. 
        for row in range(k+1,n):
            L[perm[row],k] = ( A[perm[row],k] - L[perm[row],:k] @ U[:k,k] ) / U[k,k]
        if demoMode:
            print(f"permuted A is:")
            for row in range(n):
                print(A[perm[row],:])
            print(f"intermediate U is\n{U}")
            print(f"intermediate L is\n{L}")
            print(f"perm={perm}")
    # The last row (index "-1") is special: nothing to do for L except put in the 1 on the "permuted main diagonal"
    U[n-1,n-1] = A[perm[n-1],n-1] - sum(L[perm[n-1],:n-1]*U[:n-1,n-1])
    L[perm[n-1],n-1] = 1. 
    if demoMode:
        print(f"After the final step, k={n-1}")
        print(f"U=\n{U}")
    return (L, U, perm)
A = array([[1. , -3. , 22.], [3. , 5. , -6.], [4. , 235. , 7.], ])
print(f"A=\n{A}")
(L, U, perm) = plu(A, demoMode=True)
print("\nFunction plu gives")
print(f"row permution {perm}")
print(f"L=\n{L}")
print(f"U=\n{U}")
print(f"The 'residual' A - LU is \n{A - L@U}")
A=
[[  1.  -3.  22.]
 [  3.   5.  -6.]
 [  4. 235.   7.]]
k=0
Swap row 0 with row 2
permuted A is:
[  4. 235.   7.]
[ 3.  5. -6.]
[ 1. -3. 22.]
intermediate U is
[[  4. 235.   7.]
 [  0.   0.   0.]
 [  0.   0.   0.]]
intermediate L is
[[0.25 0.   0.  ]
 [0.75 0.   0.  ]
 [1.   0.   0.  ]]
perm=[2 1 0]
k=1
No row swap needed.
permuted A is:
[  4. 235.   7.]
[ 3.  5. -6.]
[ 1. -3. 22.]
intermediate U is
[[   4.    235.      7.  ]
 [   0.   -171.25  -11.25]
 [   0.      0.      0.  ]]
intermediate L is
[[0.25       0.36058394 0.        ]
 [0.75       1.         0.        ]
 [1.         0.         0.        ]]
perm=[2 1 0]
After the final step, k=2
U=
[[   4.          235.            7.        ]
 [   0.         -171.25        -11.25      ]
 [   0.            0.           24.30656934]]

Function plu gives
row permutation [2 1 0]
L=
[[0.25       0.36058394 1.        ]
 [0.75       1.         0.        ]
 [1.         0.         0.        ]]
U=
[[   4.          235.            7.        ]
 [   0.         -171.25        -11.25      ]
 [   0.            0.           24.30656934]]
The 'residual' A - LU is 
[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]

Matrix \(L\) is not actually lower triangular, due to the permutation of its rows, but is still fine for a version of forward substitution, because

  • row \(\pi_1\) only involves \(x_1\) (multiplied by 1) and so can be used to solve for \(x_1\)

  • row \(\pi_2\) only involves \(x_1\) and \(x_2\) (the latter multiplied by 1) and so can be used to solve for \(x_2\)

Aside: Such a matrix is called psychologically lower triangular — maybe because it believes itself to be such?
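
To see this concretely with the example above, putting the rows of the computed \(L\) into the pivot order given by perm yields a genuinely unit lower triangular matrix:

print(L[perm, :])   # rows of L in pivot order

With perm = [2 1 0] and the \(L\) printed above, this gives rows [1, 0, 0], [0.75, 1, 0] and [0.25, 0.36058394, 1].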

12.2.6. Forward and backward substitution with a permutation vector

To solve \(L c = b\), all one has to change from the formulas for forward substitution seen in the previous section on the LU factorization is to put the permuted row index \(\pi_i\) in both \(L\) and \(b\):

\[ c_i = b_{\pi_i} - \sum_{j=1}^{i-1} l_{\pi_i,j} c_j,\; 1 \leq i \leq n \]
def forwardSubstitution(L, b, perm):
    """Solve L c = b for c, ith permutation of the rows of L and of b."""
    n = len(b)
    c = zeros_like(b)
    c[0] = b[perm[0]]
    for i in range(1, n):
        c[i] = b[perm[i]] - L[perm[i], :i] @ c[:i]
    return c
b = array([2., 3., 4.])
print(f"b = {b}")
b = [2. 3. 4.]
c = forwardSubstitution(L, b, perm)
print(f"c={c}")
c=[4. 0. 1.]

Then the final step, solving \(Ux = c\) for \(x\), needs no change, because \(U\) had no rows swapped, so we are done; we can import the function backwardSubstitution seen previously.

from numerical_methods_module import backwardSubstitution
x = backwardSubstitution(U, c)
print(f"x={x}")
r = b - A@x
print(f"The residual r = b - Ax is \n{r}, with maximum norm {max(abs(r))}")
x=[ 1.08678679 -0.0027027   0.04114114]
The residual r = b - Ax is 
[0. 0. 0.], with maximum norm 0.0
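
Finally, the three pieces can be wrapped into a single solver. Here is a small convenience function (the name solveByPLU is mine), assuming plu, forwardSubstitution and backwardSubstitution as defined or imported above:

def solveByPLU(A, b):
    """Solve A x = b by the strategy of this section:
    factor PA=LU once, then do one forward and one backward substitution."""
    (L, U, perm) = plu(A)
    c = forwardSubstitution(L, b, perm)
    return backwardSubstitution(U, c)

For the example above, solveByPLU(A, b) reproduces the x just computed; and since the factorization is separated from the substitutions, further right-hand sides can be solved for by repeating only the two cheap substitution steps.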

This work is licensed under Creative Commons Attribution-ShareAlike 4.0 International