Solving Ax = b With Both Pivoting and LU Factorization
Contents
3.5. Solving With Both Pivoting and LU Factorization#
References:
Section 2.4 The PA=LU Factorization of [Sauer, 2019].
Section 6.5 Matrix Factorizations of [Burden et al., 2016].
Section 8.1 Matrix Factorizations of [Chenney and Kincaid, 2012].
Introduction#
The last step in producing an algorithm for solving the general case of
This is sometimes described in three parts:
permute (reorder) the rows of the matirx
by multiplying it at left by a suitable permutation matrix ; one with a single “1” in each row and each column and zeros elsewhere;Get the LU factorization of this matrix:
.To solve
Express as
(which just involves computing , which reorders the elements of )Solve
for by forward substitutionSolve
for by backward substitution: as before, this gives and , so ; since a permutation matrix is invertible (just unravel the row swaps), this ensures that .
This gives a nice formulas in terms of matrices; however we can describe it a bit more compactly and efficiently by just talking about the permutation of the rows, described by a permutation vector — an
(Aside: I use the conventional name
A number of details of this sketch will now be filled in, including the very useful fact that the permutation vector (or matrix) can be contsructed “on the fly”, as rows are swapped in partial pivoting.
Row swapping is all you need#
Let us look at maximal element partial pivoting, but described in terms of the entries of the factors
(For now, I omit what happens to the right-hand side vector
What happens if pivoting occurs at some stage
One might fear that the process has to start again from the top using the modified version of matrix
Example: what happens at stage 5 ( )?#
To see this with a concrete example consider what happens if at stage
A) Firstly, what happens to matrix
The previous steps of the LU factorization process only involved entries of
Thus, the only change here is to swap the entries of
Thus if we are working in Julia with
( A[5, 5:end], A[10, 5:end] ) = ( A[10, 5:end], A[5, 5:end] )
B) Next, look at the work done so far on
That just consists of the previous rows
Values already computed in
C) Finally, look at the work done so far on the multipiers
The values computed so far are the first four columns of
Likewise, the same is true in reverse: the new value of
The entries of
As this is again only for some columns — the first four — the swaps needed are:
or in Julia’a slice notation for an array
( L[5, 1:4], L[10, 1:4] ) = ( L[10, 1:4], L[5, 1:4] )
The general pattern#
The example above extends to all stages
Gathering the key formulas above, this part of the algorithm is
Algorithm 3.6
for k from 1 to n-1
end
Pseudo-code for LU factorization with row swapping (first version)#
Here I also adopt slice notation; for example,
Algorithm 3.7 (LU factorization with row swapping, I)
for k from 1 to n
end
But what about the right-hand side, ?#
One thing is missing from this strategy so far:
if we are solving with a given right-hand-side column vector
but with the LU factorization we need to keep track of these swaps for use later.
This turns out to mesh nicely with another detail: we can avoid actually copying array entries around by just keeping track of the order in which we use rows to get zeros in other rows.
Our goal will be a permutation vector
First use row
to get zeros in column 1 of the other rows.Then use row
to get zeros in column 2 of the remaining rows.…
To do this:
first, initialize an array
at stage
, if the pivot element is in row , swap the corresponding elements in (rather than swapping entire rows of arrays):
Introducing the name
This pattern persists through each row swap: instead of computing a succesion of updated versions of matrix
All references to entries of
The same applies to the array
All references to entries of
Finally, since these row swaps also apply to the right-hand side
All references to entries of
Pseudo-code for LU factorization with a permutation vector#
Algorithm 3.8 (LU factorization with row swapping, II)
Initialize the permutation vector,
for k from 1 to n
end
Remark 3.16
For the version with a permutation matrix
start with an array
that is the identity matrix, and thenswap its rows
at stage instead of swapping the entries of or the rows of and .
using LinearAlgebra: norm
using LinearAlgebra: ⋅ # For the dot product (the centered dot can be typed as \cdot then tab)
include("NumericalMethods.jl")
using .NumericalMethods: printmatrix
function plu(A; demomode=false)
# Compute the Doolittle PA=LU factorization of A —
# but with the permutation recorded as permutation vector, not as the permutation matrix P.
# Sums like $\sum_{s=1}^{k-1} l_{k,s} u_{s,j}$ are done as matrix products;
# in the above case, row matrix L[k, 1:k-1] by column matrix U[1:k-1,j] gives the sum for a give j,
# and row matrix L[k, 1:k-1] by matrix U[1:k-1,k:n] gives the relevant row vector.
n = size(A)[1] # gives the number of rows in the 2D array.
# Julia can use Greek letters (and in fact, UNICODE):
# to insert character π, type \pi, hit tab, and select "π" from the menu.
# Or just call it "perm" or such.
π = collect(1:n)
# Julia language note: function "collect" converts the abstract entity "1:n" into an array of numbers.
# Initialize U as the zero matrix;
# correct below the main diagonal, with the other entries to be computed below.
U = zeros(n,n)
# Initialize L as zeros;
# correct above the main diagonal, with the other entries to be computed below,
# including the ones on the diagonal.
L = zeros(n,n)
for k in 1:n-1
if demomode; println("k=$k"); end
# Find the pivot element in column k:
pivotrow = k
abs_u_ik_max = abs(A[π[k],k])
for row in k+1:n
abs_u_ik = abs(A[π[row],k])
if abs_u_ik > abs_u_ik_max
pivotrow = row
abs_u_ik_max = abs_u_ik
end
end
if pivotrow > k # swap rows, virtually
if demomode; println("Swap row $k with row $pivotrow"); end
(π[k], π[pivotrow]) = (π[pivotrow], π[k])
else
if demomode; println("No row swap needed."); end
end
U[k,k:end] = A[[π[k]],k:end] - L[[π[k]],1:k] * U[1:k,k:end]
L[π[k],k] = 1.
for row in k+1:n
L[π[row],k] = ( A[π[row],k] - L[π[row],1:k] ⋅ U[1:k,k] ) / U[k,k]
# Julia note: To enter the centered dot notation for the dot product, type "\cdot" and then hit the tab key.
end
if demomode
println("permuted A is:")
for row in 1:n
println(A[π[row],:])
end
println("Intermediate L is"); printmatrix(L)
println("Intermediate U is"); printmatrix(U)
end
end
# The last row (index "end") is special: nothing to do for L except put in the 1 on the "permuted main diagonal"
L[π[end],end] = 1.
U[end,end] = A[π[end],end] - L[π[end],1:end-1] ⋅ U[1:end-1,end]
if demomode
println("After the final step, k=$(n-1)")
println("L is"); printmatrix(L)
println("U is"); printmatrix(U)
end
return (L, U, π)
end;
A = [ 1.0 -3.0 22.0 ; 3.0 5.0 -6.0 ; 4.0 235.0 7.0 ]
println("A is"); printmatrix(A)
(L, U, π) = plu(A, demomode=true)
println("\nFunction plu gives")
println("L="); printmatrix(L)
println("U="); printmatrix(U)
println("row permution $(π)")
println("The 'residual' or 'backward error' A-LU is"); printmatrix(A - L*U)
A is
[ 1.0 -3.0 22.0
3.0 5.0 -6.0
4.0 235.0 7.0 ]
k=1
Swap row 1 with row 3
permuted A is:
[4.0, 235.0, 7.0]
[3.0, 5.0, -6.0]
[1.0, -3.0, 22.0]
Intermediate L is
[ 0.25 0.0 0.0
0.75 0.0 0.0
1.0 0.0 0.0 ]
Intermediate U is
[ 4.0 235.0 7.0
0.0 0.0 0.0
0.0 0.0 0.0 ]
k=2
No row swap needed.
permuted A is:
[4.0, 235.0, 7.0]
[3.0, 5.0, -6.0]
[1.0, -3.0, 22.0]
Intermediate L is
[ 0.25 0.3605839416058394 0.0
0.75 1.0 0.0
1.0 0.0 0.0 ]
Intermediate U is
[ 4.0 235.0 7.0
0.0 -171.25 -11.25
0.0 0.0 0.0 ]
After the final step, k=2
L is
[ 0.25 0.3605839416058394 1.0
0.75 1.0 0.0
1.0 0.0 0.0 ]
U is
[ 4.0 235.0 7.0
0.0 -171.25 -11.25
0.0 0.0 24.306569343065693 ]
Function plu gives
L=
[ 0.25 0.3605839416058394 1.0
0.75 1.0 0.0
1.0 0.0 0.0 ]
U=
[ 4.0 235.0 7.0
0.0 -171.25 -11.25
0.0 0.0 24.306569343065693 ]
row permution [3, 2, 1]
The 'residual' or 'backward error' A-LU is
[ 0.0 0.0 0.0
0.0 0.0 0.0
0.0 0.0 0.0 ]
Matrix
row
only involves (multiplied by 1) and so can be used to solve forrow
only involves and (the latter multiplied by 1) and so can be used to solve for…
Definition 3.6 (Psychologically [lower] triangular)
A matrix like this — one that is a row-permutation of a [lower] triangular matrix — is called psychologically [lower] triangular. (Maybe because it believes itself to be such?)
Forward and backward substitution with a permutation vector#
To solve
function forwardsubstitution(L, b, π)
# Version 2: with permutation of rows
# Solve L c = b for c, with permutation of the rows of L and of b.
n = length(b)
c = zeros(n)
c[1] = b[π[1]]
for i in 2:n
c[i] = b[π[i]] - L[π[i], 1:i] ⋅ c[1:i]
end
return c
end;
b = [2.0, 3.0, 4.0];
c = forwardsubstitution(L, b, π)
print("c = $c")
c = [4.0, 0.0, 1.0]
Then the final step, solving backwardSubstitution
seen previously
using .NumericalMethods: backwardsubstitution
x = backwardsubstitution(U, c)
println("x = $x")
Ax = A*x
r = b - A*x
println("The residual r = b - Ax is \n$r\nwith maximum norm $(norm(r, Inf))")
x = [1.0867867867867869, -0.002702702702702703, 0.04114114114114114]
The residual r = b - Ax is
[0.0, 0.0, 0.0]
with maximum norm 0.0