Partial Pivoting
Contents
3.3. Partial Pivoting#
References:
Section 2.4.1 Partial Pivoting of [Sauer, 2019].
Section 6.2 Pivoting Stratgies of [Burden et al., 2016].
Section 7.1 of [Chenney and Kincaid, 2012].
Some references describe the method of scaled partial pivoting, but here we present instead a version without the “scaling”, because not only is it simpler, but modern research shows that it is esentially always as good, once the problem is set up in a “sane” way.
Introduction#
The basic row reduction method can fail due to divisoion by zero (and to have very large rouding errors when a denominator is extremely close to zero. A more robust modification is to swap the order of the equations to avaid these problems: partial pivotng. Here we look at a particularly robust version of this strategy, Maximal Element Partial Pivoting.
What can go wrong with naive Gaussian elimination?#
We have noted two problems with the naive algorithm for Gaussian elimination: total failure due the division be zero, and loss of precision due to dividing by very small values — or more preciselt calculations the lead to intermediate values far larger than the final results. The culprits in all cases are the same: the denominators are first the pivot elements \(a_{k,k}^{(k-1)}\) in evaluation of \(l_{i,k}\) during row reduction and then the \(u_{k,k}\) in back substitution. Further, those \(a_{k,k}^{(k-1)}\) are the final updated values at indices \((k,k)\), so are the same as \(u_{k,k}\). Thus it is exactly these main diagonal elements that we must deal with.
The basic fix: partial pivoting#
The basic strategy is that at step \(k\), we can swap equation \(k\) with any equation \(i\), \(i > k\).
Note that this involves swapping those rows of array A
and also those elements of the array b
for the right-hand side: \(b_k \leftrightarrow b_i\).
This approach of swapping equations (swapping rows in arrays A
and b
) is called pivoting, or more specifically partial pivoting, to distinguish from the more elaborate strategy where to columns of A
are also reordered (which is equivalent to reordeting the unknowns in the equations).
The row that is swapped with row \(k\) is sometimes called the pivot row, and the new denominator is the corresponding pivot element.
This approach is robust so long as one is using exact arithmetic: it works for any well-posed system because so long as the \(Ax = b\) has a unique solution — so that the original matrix \(A\) is non-singular — at least one of the \(a_{i,k}^{(k-1)}, i \geq k\) will be non-zero, and thus the swap will give a new element in position \((k,k)\) that is non-zero. (I will stop caring about superscripts to distinguish updates, but if you wish to, the elements of the new row \(k\) could be called either \(a_{k,j}^{(k)}\) or even \(u_{k,j}\), since those values are in their final state.)
Handling rounding error: maximal element partial pivoting#
The final refinement is to seek the smallest possible magnitudes for intermediate values, and thus the smallest absolute errors in them, by making the multipliers \(l_{i,k}\) small, in turn by making the denominator \(a_{k,k}^{(k-1)} = u_{k,k}\) as large as possible in magnitude:
At step \(k\), choose the pivot row \(p_k \geq k\) so that \(|a_{p_k,k}^{(k-1)}| \geq |a_{i,k}^{(k-1)}|\) for all \(i \geq k\). If there is more that one such element of largest magnitude, use the lowest value: in particular, if \(p_k = k\) works, use it and do not swap!
A consequence of this is that the mutipliers used in the row operations all have \(|l_{i,k}| = \left| \frac{a_{p_i,k}^{(k-1)}}{a_{p_k,k}^{(k-1)}} \right| < 1.\)
(Swapping values in Julia)
In Julia (as in Python) the value of two variables a
and b
can be swapped via tuples with (a, b) = (b, a)
,
and combined with array slicing, this can also be used to swap rows (or columns)
a = 1
b = 2
(a, b) = (b, a)
(2, 1)
A = [11 12 13 ; 21 22 23 ; 31 32 33]
3×3 Matrix{Int64}:
11 12 13
21 22 23
31 32 33
(A[1,:], A[2,:]) = (A[2,:], A[1,:])
A
3×3 Matrix{Int64}:
21 22 23
11 12 13
31 32 33
See Exercise 1.
Some demonstrations#
No row reduction is done here, so entire rows are swapped rather than just the elements from column \(k\) onward:
First, get the matrix pretty-printer seen in Row Reduction/Gaussian Elimination; this time from Module NumericalMethods:
include("NumericalMethods.jl")
using .NumericalMethods: printmatrix
A = [1 -6 2 ; 3 5 -6 ; 4 2 7]
n = 3
println("Initially A is")
printmatrix(A)
Initially A is
[ 1 -6 2
3 5 -6
4 2 7 ]
k = 1
p = 3
temp = copy(A[k,:])
A[k,:] = A[p,:]
A[p,:] = temp
println("After swapping rows 1 <-> 3 using slicing and a temporary row, A is")
printmatrix(A)
After swapping rows 1 <-> 3 using slicing and a temporary row, A is
[ 4 2 7
3 5 -6
1 -6 2 ]
k = 2
p = 3
for j in 1:n
( A[k,j] , A[p,j] ) = ( A[p,j] , A[k,j] )
end
println("After swapping rows 2 <-> 3 using a loop and tuples of elements (no temp) A is:")
printmatrix(A)
After swapping rows 2 <-> 3 using a loop and tuples of elements (no temp) A is:
[ 4 2 7
1 -6 2
3 5 -6 ]
k = 1
p = 2
( A[k,:] , A[p,:] ) = ( copy(A[p,:]) , copy(A[k,:]) )
println("After swapping rows 1 <-> 2 using tuples of slices (no loop or temp) A is:")
printmatrix(A)
After swapping rows 1 <-> 2 using tuples of slices (no loop or temp) A is:
[ 1 -6 2
4 2 7
3 5 -6 ]
When is it safe to do without pivoting?#
Theorem 3.1 shows that diagonal dominance guarantees that pivoting is not necessary because the diagonal elements are never zero.
In the case of the matrix \(A\) being column-wise SDD as in Definition 3.2, the situation is even better; there is no reason for pivoting:
If matrix \(A\) is column-wise SDD, maximal element partial pivoting in fact does no row-swaps; it does the same thing as naive Gaussian elimination.
Being row-wise SDD is more “natural” and common than being column-wise SDD, because the former is a property “within” each of the equations that go into the matrix. This might seem unfortunate, but there is a way to get the benefits of the above nice result also for row-wise SDD matrices, which we will see in the section Solving Ax = b with LU factorization with the Crout decomposition.
Exercises#
Exercise 1#
Explain why we cannot just swap the relevant elements of rows \(k\) and \(p\) with:
for j in 1:n
A[k,j] = A[p,j]
A[p,j] = A[k,j]
end
or in vectorized form:
A[k,:] = A[p,:]
A[p,:] = A[k,:]
Describe what happens instead.