The Bunch-Parlett Decomposition
The classical $LDL^T$ decomposition of a symmetric matrix $A$ is defined by

$$A = L D L^T$$

with a lower triangular matrix $L$ with unit diagonal and a diagonal matrix $D$. This can be computed by the following compact algorithm:

$$i = 1, \dots, n: \quad d_{ii} = a_{ii} - \sum_{k=1}^{i-1} l_{ik}^2\, d_{kk}, \qquad l_{ii} := 1$$
$$j = i+1, \dots, n: \quad l_{ji} = \Bigl( a_{ji} - \sum_{k=1}^{i-1} l_{ik}\, l_{jk}\, d_{kk} \Bigr) \Big/ d_{ii}$$

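As an illustration, the compact algorithm above can be sketched in NumPy as follows; the function name and the test matrix are made up for this example, and since no pivoting is done, the sketch is only appropriate for positive definite input:

```python
import numpy as np

def ldlt(A):
    """Compact LDL^T factorization without pivoting.

    Suitable for symmetric positive definite A; numerically
    unstable for indefinite matrices (see text)."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    L = np.eye(n)
    d = np.zeros(n)
    for i in range(n):
        # d_ii = a_ii - sum_k l_ik^2 d_kk
        d[i] = A[i, i] - np.sum(L[i, :i] ** 2 * d[:i])
        for j in range(i + 1, n):
            # l_ji = (a_ji - sum_k l_ik l_jk d_kk) / d_ii
            L[j, i] = (A[j, i] - np.sum(L[i, :i] * L[j, :i] * d[:i])) / d[i]
    return L, d

# quick check on a small SPD matrix
A = np.array([[4.0, 2.0, 2.0],
              [2.0, 3.0, 1.0],
              [2.0, 1.0, 5.0]])
L, d = ldlt(A)
print(np.allclose(L @ np.diag(d) @ L.T, A))  # True
```

As the text notes, for indefinite matrices this recursion can break down (a zero $d_{ii}$) or amplify rounding errors arbitrarily.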
For a positive definite matrix ($A$ is positive definite if and only if all the $d_{ii}$ are strictly positive) this is a perfectly backward stable algorithm. It is associated with the Cholesky decomposition: the so-called Cholesky factor is simply $L D^{1/2}$. As long as none of the $d_{ii}$ becomes zero, the algorithm can also be carried out for indefinite matrices, but it is then numerically unstable; one should never use it in this situation. The alternative is the Bunch-Parlett decomposition, which is stable also in the indefinite case and which differs from the above in two respects: the $D$-part is now block diagonal with $1\times 1$ and $2\times 2$ blocks, and the introduction of a symmetric permutation is indispensable. It reads
$$P^T A P = L D L^T$$

Here $P$ denotes the permutation matrix. We describe the version with complete pivoting. The total algorithm consists of up to $n-1$ steps, each of which decomposes into two parts. The first part searches for the pivot: this is the element of largest absolute value in the current lower right submatrix of dimension $(n-i+1)\times(n-i+1)$. If this element occurs on the diagonal, it becomes a $1\times 1$ block in $D$ and is permuted to position $(i,i)$ in the current matrix. Then a normal Gaussian elimination step is performed, with the multipliers stored in column $i$, and $i$ is increased by one.
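The pivot search of the first part can be sketched as follows; the function name and the returned block-size tag are made up for this illustration, and indices are 0-based:

```python
import numpy as np

def find_complete_pivot(A, i):
    """Search the trailing submatrix A[i:, i:] for the element of
    largest absolute value (complete pivoting).

    Returns ('1x1', j) if the maximum lies on the diagonal,
    else ('2x2', j, k) for an off-diagonal maximum at (j, k)."""
    S = A[i:, i:]
    # index of the largest |entry| in the trailing submatrix
    r, c = np.unravel_index(np.argmax(np.abs(S)), S.shape)
    j, k = i + r, i + c
    if j == k:
        return ('1x1', int(j))
    return ('2x2', int(min(j, k)), int(max(j, k)))

A = np.array([[0.0, 3.0],
              [3.0, 1.0]])
print(find_complete_pivot(A, 0))  # ('2x2', 0, 1)
```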
If, however, the element of largest absolute value appears in position $(j,k)$ (with $j, k \ge i$ and $k > j$), then the submatrix

$$\begin{pmatrix} a_{j,j} & a_{j,k} \\ a_{k,j} & a_{k,k} \end{pmatrix}$$

becomes a $2\times 2$ block in $D$, and the rows $i$ and $j$ as well as $i+1$ and $k$ are swapped, and likewise the columns, of course. Hence we now have to perform a block elimination with a $2\times 2$ block. This uses the formula

$$\begin{pmatrix} B_{1,1} & B_{1,2} \\ B_{2,1} & B_{2,2} \end{pmatrix} = \begin{pmatrix} I & O \\ B_{2,1} B_{1,1}^{-1} & I \end{pmatrix} \begin{pmatrix} B_{1,1} & O \\ O & C_{2,2} \end{pmatrix} \begin{pmatrix} I & B_{1,1}^{-1} B_{1,2} \\ O & I \end{pmatrix}$$

with
$$C_{2,2} = B_{2,2} - B_{2,1} B_{1,1}^{-1} B_{1,2}.$$

The two columns of the left factor go into $L$, and the new remaining submatrix to be processed is $C_{2,2}$, also known as the Schur complement. In this case $i$ is increased by two.
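A minimal NumPy sketch of the complete algorithm is given below. It follows the simplified pivot rule exactly as described in the text; the published Bunch-Parlett algorithm additionally compares the largest diagonal magnitude against the overall maximum with a fixed threshold factor before settling on a $1\times 1$ block. All names and the test matrix are made up:

```python
import numpy as np

def bunch_parlett(A):
    """Sketch of P^T A P = L D L^T with complete pivoting.

    D is block diagonal with 1x1 and 2x2 blocks, L is unit lower
    triangular, and perm encodes the permutation P."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    L, D, perm = np.eye(n), np.zeros((n, n)), np.arange(n)

    def swap(a, b):
        if a == b:
            return
        A[[a, b], :] = A[[b, a], :]      # symmetric row/column swap
        A[:, [a, b]] = A[:, [b, a]]
        m = min(a, b)
        L[[a, b], :m] = L[[b, a], :m]    # already-computed multipliers
        perm[[a, b]] = perm[[b, a]]

    i = 0
    while i < n:
        # pivot search: largest |entry| of the trailing submatrix
        r, c = np.unravel_index(np.argmax(np.abs(A[i:, i:])), (n - i, n - i))
        j, k = sorted((i + r, i + c))
        if j == k:                        # diagonal pivot: 1x1 block
            swap(i, j)
            D[i, i] = A[i, i]
            if i + 1 < n:
                L[i+1:, i] = A[i+1:, i] / A[i, i]
                A[i+1:, i+1:] -= np.outer(L[i+1:, i], A[i, i+1:])
            i += 1
        else:                             # off-diagonal pivot: 2x2 block
            swap(i, j)
            swap(i + 1, k)
            D[i:i+2, i:i+2] = A[i:i+2, i:i+2]
            if i + 2 < n:
                # multipliers B21 B11^{-1} (A stays symmetric)
                W = np.linalg.solve(A[i:i+2, i:i+2], A[i:i+2, i+2:]).T
                L[i+2:, i:i+2] = W
                A[i+2:, i+2:] -= W @ A[i:i+2, i+2:]   # Schur complement
            i += 2
    return L, D, perm

A0 = np.array([[ 1.0,  4.0,  0.0],
               [ 4.0,  2.0, -1.0],
               [ 0.0, -1.0, -3.0]])
L, D, perm = bunch_parlett(A0)
print(np.allclose(L @ D @ L.T, A0[np.ix_(perm, perm)]))  # True
```

On this indefinite example the first pivot is the off-diagonal $4$, so the algorithm starts with a $2\times 2$ block, exactly as described above.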
Finally, by Sylvester's law of inertia, the matrix $A$ has as many negative eigenvalues as $D$ has $2\times 2$ blocks plus negative $1\times 1$ blocks, since each $2\times 2$ block has exactly one positive and one negative eigenvalue by construction. This decomposition is used here in a twofold manner: by shifting all eigenvalues of $D$ above some level $\gamma > 0$ through addition of a multiple $\mu I$ of the identity matrix, we construct a positive definite regularization of the Hessian,
$$L (D + \mu I) L^T = \tilde{H}$$

and a strongly gradient-related direction of descent $d$ by
$$\tilde{H}\, d = -\nabla f(x)$$

and a direction of negative curvature $z$ from
$$L^T P^T z = y$$

where $y$ is composed of the eigenvectors corresponding to the negative eigenvalues of $D$. Indeed, we then have
$$z^T A z = y^T D y.$$

We normalize $z$ such that $z^T \nabla f(x) \le 0$ and use $z$ as a direction of descent if it promises better progress than $d$.
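The whole construction above (inertia count, regularization, descent and negative-curvature directions) can be sketched with SciPy's `ldl`. Note that `scipy.linalg.ldl` uses Bunch-Kaufman (partial) pivoting rather than Bunch-Parlett, but its $2\times 2$ blocks are likewise indefinite, so the same reasoning applies; `H` stands in for an indefinite Hessian and `g` for $\nabla f(x)$, both made up for this illustration:

```python
import numpy as np
from scipy.linalg import ldl

H = np.array([[ 1.0,  4.0,  0.0],
              [ 4.0,  2.0, -1.0],
              [ 0.0, -1.0, -3.0]])
g = np.array([1.0, -2.0, 0.5])       # stands in for grad f(x)
gamma = 1e-2

lu, D, _ = ldl(H)                    # H == lu @ D @ lu.T (P absorbed in lu)

# inertia: congruence preserves the eigenvalue signs (Sylvester)
vals = np.linalg.eigvalsh(D)         # D is block diagonal, this is cheap
neg = int(np.sum(vals < 0))
print(neg == np.sum(np.linalg.eigvalsh(H) < 0))  # True

# regularization: mu chosen so every eigenvalue of D + mu*I is >= gamma
mu = max(0.0, gamma - vals.min())
H_tilde = lu @ (D + mu * np.eye(3)) @ lu.T

d = np.linalg.solve(H_tilde, -g)     # descent direction, d^T g < 0

# negative curvature: y = eigenvector of D for its most negative
# eigenvalue, then solve L^T P^T z = y (here simply lu.T z = y)
w, V = np.linalg.eigh(D)
z = np.linalg.solve(lu.T, V[:, np.argmin(w)])
if z @ g > 0:                        # normalize so z^T grad f(x) <= 0
    z = -z
print(d @ g < 0, z @ H @ z < 0)      # True True
```

By construction $z^T H z = y^T D y$ equals the most negative eigenvalue of $D$, so $z$ is indeed a direction of negative curvature.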

