Minimization by coordinate descent

- In this application we search for a local minimizer x_opt of a smooth
nonlinear function f on R^n. This means
f(x_opt) <= f(x) for all x in some neighborhood of x_opt.
Strictly speaking, the method only tries to find a stationary point of f,
i.e. a zero of ∇f(x). If f is locally convex there, this will indeed be
a minimizer; convexity, however, is not needed for this method.
We require f in C^1, and the gradient is used explicitly, since
the Al-Baali-Fletcher stepsize technique depends on it.
We compute the gradient analytically here, with the exception of the
predefined test case 14, where it is approximated by a high precision
finite difference formula of sixth order, which almost reaches machine
precision (a sketch of such a formula is given after this list).
For a bit of theory concerning the background, look here.
- The coordinate directions are used in a cyclic manner as directions
of descent. The sign of each direction is chosen according to
the directional derivative, so that -d_k is a descent direction.
- Beginning with an initial guess x_0 we iterate
x_{k+1} = x_k - σ_k d_k.
Here d_k = ± e_{(k mod n)+1}
with ∇f(x_k)^T d_k >= 0,
where e_j is the j-th coordinate unit vector.
- The stepsize σ_k is found by the method published by Al-Baali and
Fletcher (J.O.T.A., 1985). This technique is a bracketing method using
quadratic and cubic interpolation, aiming at satisfying
the strong Powell-Wolfe conditions (we omit the index k):
|∇f(x_next)^T d| <= κ |∇f(x)^T d|
f(x) - f(x_next) >= σ δ |∇f(x)^T d|,
where x_next = x - σ d. Here δ and κ are two parameters with default
values 0.01 and 0.8. They must always satisfy
0 < δ < κ < 1.
A very small κ means an almost exact unidimensional minimization
in the direction -d_k. (A simplified sketch of the whole iteration,
including such a line search, is given after this list.)
In order to compute a first guess for σ_k we use the minimizer of the
quadratic polynomial in σ defined by the values of f at σ = 0 and
σ = 1 and the directional derivative -∇f(x)^T d at σ = 0.
One can show that this estimate tends to the first local minimizer
and eventually satisfies the strong Powell-Wolfe conditions, provided
0 < δ < 1/2 < κ < 1.
- In the quadratic case
f(x) = (1/2) x^T A x - b^T x
with a positive definite A this gives the exact unidimensional
minimizer, and the method reduces to the Gauss-Seidel linear solver
for Ax = b (see the small check after this list).
- We terminate the iteration if
- the user function returns an error indicator with σ reduced
below its lower bound (10^-10),
- the maximum number of steps (4000) has been performed,
- the stepsize algorithm found no progress for all n directions
in succession,
- f(x) - f(x_next) <= ε_f (|f(x_next)| + ε_f)
for 2n successive steps, with ε_f = 10^-12: no useful
progress obtainable,
- for n successive steps the relative change in the coordinates of
x was less than ε_x (||x|| + ε_x) and the corresponding
gradient component less than max{ε_g, machine epsilon = 2.2·10^-16}
(this is the ''good'' type of termination).
- If the level set of f at x_0, i.e. {x : f(x) <= f(x_0)}, is bounded,
then this method will find a stationary point (to the precision allowed
by the machine precision and the parameter settings).
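
A sketch of the sixth-order gradient approximation mentioned above, in
Python. The seven-point central difference stencil below is a standard
sixth-order formula; it is an assumption that test case 14 uses exactly
this stencil, and the name grad_fd6 and the default stepwidth h are
hypothetical:

    import numpy as np

    def grad_fd6(f, x, h=1e-3):
        """Sixth-order central difference approximation of the gradient
        of f at x; truncation error O(h^6) per component."""
        x = np.asarray(x, dtype=float)
        g = np.empty_like(x)
        # classical 7-point stencil:
        # f'(t) ~ (-f(t-3h) + 9 f(t-2h) - 45 f(t-h)
        #          + 45 f(t+h) - 9 f(t+2h) + f(t+3h)) / (60 h)
        w = np.array([-1.0, 9.0, -45.0, 45.0, -9.0, 1.0]) / 60.0
        s = np.array([-3.0, -2.0, -1.0, 1.0, 2.0, 3.0])
        for j in range(x.size):
            e = np.zeros_like(x)
            e[j] = 1.0
            g[j] = sum(wi * f(x + si * h * e) for wi, si in zip(w, s)) / h
        return g

Balancing the O(h^6) truncation error against rounding errors of order
eps/h suggests h of roughly eps^(1/7), i.e. a few times 10^-3 in double
precision, which is why such a formula can come close to machine
precision.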
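The following Python sketch outlines the whole iteration: cyclic
coordinate directions with the sign chosen from the gradient component,
the quadratic-interpolation first guess for σ, and a line search
enforcing the strong Powell-Wolfe conditions. It is a simplified
stand-in for the Al-Baali-Fletcher procedure (which refines the bracket
by quadratic and cubic interpolation; plain bisection is used here),
and the names wolfe_step and coordinate_descent are illustrative
assumptions, not the original code:

    import numpy as np

    def wolfe_step(f, grad, x, d, delta=0.01, kappa=0.8, sigma_min=1e-10):
        """Crude bracketing search for phi(s) = f(x - s*d), enforcing the
        strong Powell-Wolfe conditions. Assumes grad(x) @ d >= 0."""
        f0, slope0 = f(x), grad(x) @ d          # slope0 = |grad f(x)^T d|
        # first guess: minimizer of the quadratic interpolating
        # phi(0) = f0, phi(1) = f(x - d), phi'(0) = -slope0
        c = f(x - d) - f0 + slope0              # quadratic coefficient
        s = slope0 / (2.0 * c) if c > 0 else 1.0
        lo, hi = 0.0, np.inf
        for _ in range(60):
            if s < sigma_min:
                break
            xn = x - s * d
            fn, sn = f(xn), grad(xn) @ d
            if f0 - fn < delta * s * slope0:    # decrease condition fails:
                hi = s                          #   step too long
            elif sn > kappa * slope0:           # slope still steep:
                lo = s                          #   step too short
            elif sn < -kappa * slope0:          # overshot the 1-D minimizer
                hi = s
            else:                               # both conditions hold
                return s, xn, fn
            s = 2.0 * s if hi == np.inf else 0.5 * (lo + hi)
        return None                             # no acceptable stepsize

    def coordinate_descent(f, grad, x0, max_steps=4000):
        x = np.asarray(x0, dtype=float)
        n, fails = x.size, 0
        for k in range(max_steps):
            g = grad(x)
            d = np.zeros(n)
            j = k % n                           # cyclic coordinate choice
            d[j] = 1.0 if g[j] >= 0.0 else -1.0 # sign so that g^T d >= 0
            step = wolfe_step(f, grad, x, d)
            if step is None:
                fails += 1
                if fails == n:                  # no progress in all n
                    break                       # directions in succession
            else:
                fails = 0
                _, x, _ = step                  # accept x <- x - s*d
        # (the f-, x- and gradient-based stopping tests listed above are
        #  omitted here for brevity)
        return x

    # usage on a separable quadratic with minimizer (1, -2):
    f = lambda x: (x[0] - 1.0)**2 + 4.0 * (x[1] + 2.0)**2
    g = lambda x: np.array([2.0 * (x[0] - 1.0), 8.0 * (x[1] + 2.0)])
    print(coordinate_descent(f, g, [0.0, 0.0]))  # close to [1, -2]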
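The Gauss-Seidel equivalence in the quadratic case is easy to verify:
minimizing f(x) = (1/2) x^T A x - b^T x exactly along e_j means solving
(Ax - b)_j = 0 for x_j, which is precisely the Gauss-Seidel update.
A minimal check, with hypothetical function names and a small example
matrix assumed here:

    import numpy as np

    def exact_coordinate_cycle(A, b, x):
        """One cycle of exact coordinate-wise minimization of
        f(x) = 1/2 x^T A x - b^T x, A symmetric positive definite."""
        x = x.copy()
        for j in range(len(b)):
            # 1-D minimizer along e_j: solve d/dx_j f = (Ax - b)_j = 0
            x[j] += (b[j] - A[j] @ x) / A[j, j]
        return x

    def gauss_seidel_sweep(A, b, x):
        """One Gauss-Seidel sweep for the linear system Ax = b."""
        x = x.copy()
        for j in range(len(b)):
            x[j] = (b[j] - A[j] @ x + A[j, j] * x[j]) / A[j, j]
        return x

    A = np.array([[4.0, 1.0], [1.0, 3.0]])      # symmetric positive definite
    b = np.array([1.0, 2.0])
    x = np.zeros(2)
    print(exact_coordinate_cycle(A, b, x))      # [0.25  0.58333...]
    print(gauss_seidel_sweep(A, b, x))          # identical iterates
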
Input

- You can choose between 14 predefined functions, for which a short
description of the examples can be read here.
You may also define a function f of your own,
depending on n variables x(1),...,x(n),
and you may specify the initial guess x_0.
- Either you use the standard parameter settings for the termination
criteria or you specify them yourself. For convenience these have been
reduced to ε_x, δ and κ, setting ε_g = ε_x.
- For the predefined cases you can replace the standard initial point by
another one. The required dimension can be read from the input form.
- You can request a contour plot of f around the best point found,
with two of the n variables varying and the other components of x fixed
at their current values. You can specify the size of the region.
Be careful: these functions often grow so rapidly that on a large
region you may ''see'' nothing useful.

Output

- The CPU time used. If this appears as zero, it means less than
0.01 seconds.
- The termination reason.
- The computed best values f(x_opt) and
x_opt(1),...,x_opt(n) and the gradient components
(each one referring to one of the last n steps).
- The number of function and gradient component evaluations used.
- Two plots in semi-logarithmic format showing f_k - f_opt
and |(∇f(x_k))_{i(k)}|, the gradient component for the coordinate
active in step k.
- The contour plot, if requested.