Minimization by coordinate descent


 
  • In this application we search for a local minimizer x_opt of a smooth nonlinear function f on R^n. This means
    f(x_opt) <= f(x) for all x from some neighborhood of x_opt. Strictly speaking, the method only tries to find a stationary point of f, i.e. a zero of ∇f(x). Should f be locally convex there, then this will indeed be a minimizer; convexity, however, is not needed for this method. We need f in C^1, and the gradient is used explicitly, since the Al-Baali-Fletcher stepsize technique requires it. We compute the gradient analytically here, with the exception of the predefined test case 14, where it is obtained from a high-precision finite difference formula of 6th order, which almost reaches machine precision. For a bit of theory concerning the background, look here.
  • The coordinate directions are used in a cyclic manner as directions of descent; the sign is chosen according to the directional derivative.
  • Beginning with an initial guess x_0 we iterate: x_{k+1} = x_k - σ_k d_k.
    Here d_k = ± e_{(k mod n)+1} with ∇f(x_k)^T d_k >= 0, where e_j is the j-th coordinate unit vector. (A small sketch of this scheme is given after this list.)
  • The stepsize σ_k is found by the method published by Al-Baali and Fletcher (J.O.T.A. 1985). This technique is a bracketing method using quadratic and cubic interpolation, aiming at satisfying the strong Powell-Wolfe conditions (we omit the index k and write x_next = x - σ*d):
    |∇ f(x_next)'d| <= κ*|∇ f(x)'d|
    f(x) - f(x_next) >= σ*δ*|∇ f(x)'d| .
    Here δ and κ are two parameters with default values 0.01 and 0.8. They must always satisfy
    0 < δ < κ < 1 .
    A very small κ means an almost exact unidimensional minimization in the direction -d_k. In order to compute a first guess for σ_k we use the minimizer of the quadratic polynomial in σ defined by the values of f at σ=0 and σ=1 and the directional derivative -∇ f(x)'d at σ=0. One can show that this estimate tends to the first local minimizer and satisfies the strong Powell-Wolfe conditions provided
    0 < δ < 1/2 < κ < 1 .
  • In the quadratic case f(x) = (1/2)*x^T*A*x - b^T*x with a positive definite A this gives the exact unidimensional minimizer: along ±e_i the exact stepsize is σ = (Ax - b)_i/A_ii, so the i-th coordinate is replaced by (b_i - Σ_{j≠i} A_ij*x_j)/A_ii, and the method reduces to the Gauss-Seidel linear solver for Ax = b (see the small check after this list).
  • We terminate the iteration if
    • the user function returns an error indicator with σ reduced below its lower bound (10^-10),
    • the maximum number of steps has been performed (4000),
    • the stepsize algorithm found no progress for all n directions in succession,
    • f(x) - f(x_next) <= ε_f*(|f(x_next)| + ε_f) for 2n successive steps, with ε_f = 10^-12: no useful progress obtainable,
    • for n successive steps the relative change in the coordinates of x was less than ε_x*(||x|| + ε_x) and the corresponding gradient component less than max{ε_g, machine epsilon = 2.2*10^-16} (this is the "good" type of termination).
  • If the level set of f for x_0 is bounded, then this method will find a stationary point (to the precision allowed by the machine precision and the parameter settings).
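The following sketch (in Python with NumPy) illustrates the scheme described in the list above: cyclic coordinate directions with the sign taken from the directional derivative, a first stepsize guess from quadratic interpolation, and a line search aiming at the strong Powell-Wolfe conditions. It is a minimal illustration under simplifying assumptions: the function names are made up, the line search is a plain bracketing/bisection scheme instead of the quadratic/cubic interpolation of Al-Baali and Fletcher, and only a reduced version of the termination tests is included.

```python
import numpy as np


def initial_guess(f0, f1, slope0):
    """First guess for sigma from the quadratic interpolating f at sigma=0,
    sigma=1 and the directional derivative -slope0 at sigma=0."""
    c = f1 - f0 + slope0                  # curvature coefficient of the quadratic
    return slope0 / (2.0 * c) if c > 0.0 else 1.0


def wolfe_step(f, grad, x, d, delta=0.01, kappa=0.8,
               sigma0=1.0, sigma_min=1e-10, max_trials=50):
    """Search a stepsize sigma for the step x -> x - sigma*d (grad(x) @ d >= 0)
    satisfying the two strong Powell-Wolfe conditions quoted in the text.
    Plain bracketing/bisection, not the Al-Baali-Fletcher interpolation."""
    f0 = f(x)
    slope0 = grad(x) @ d                  # directional derivative along +d, >= 0
    lo, hi = 0.0, None                    # current bracket for sigma
    sigma = sigma0
    for _ in range(max_trials):
        x_next = x - sigma * d
        g_next = grad(x_next) @ d
        if f0 - f(x_next) < sigma * delta * abs(slope0):
            hi = sigma                    # not enough decrease: step too long
        elif abs(g_next) <= kappa * abs(slope0):
            return sigma, x_next          # both conditions satisfied
        elif g_next > 0.0:
            lo = sigma                    # minimum along -d not yet reached: go further
        else:
            hi = sigma                    # minimum along -d passed: shrink
        sigma = 0.5 * (lo + hi) if hi is not None else 2.0 * sigma
        if sigma < sigma_min:
            break
    return None, x                        # no acceptable stepsize found


def coordinate_descent(f, grad, x0, eps_x=1e-8, max_steps=4000):
    """Cyclic coordinate descent: d_k = +/- e_{(k mod n)+1} with
    grad(x_k) @ d_k >= 0 and x_{k+1} = x_k - sigma_k * d_k."""
    x = np.asarray(x0, dtype=float).copy()
    n = x.size
    stalled = 0                           # successive directions without progress
    for k in range(max_steps):
        i = k % n
        g = grad(x)
        d = np.zeros(n)
        d[i] = 1.0 if g[i] >= 0.0 else -1.0   # sign from the directional derivative
        if abs(g[i]) < 1e-12:             # this coordinate is (nearly) stationary
            stalled += 1
            if stalled >= n:
                break
            continue
        f0 = f(x)
        sigma0 = initial_guess(f0, f(x - d), abs(g[i]))
        sigma, x_new = wolfe_step(f, grad, x, d, sigma0=sigma0)
        if sigma is None or np.max(np.abs(x_new - x)) < eps_x * (np.linalg.norm(x) + eps_x):
            stalled += 1                  # no (useful) progress in this direction
        else:
            stalled = 0
        if sigma is not None:
            x = x_new
        if stalled >= n:                  # no progress for n directions in succession
            break
    return x
```

The bisection in wolfe_step is kept deliberately simple; the routine used by the application interpolates quadratically and cubically inside the bracket, which normally needs far fewer function and gradient evaluations.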
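For the quadratic case, a small made-up 2x2 example can illustrate the statement above: one cycle of coordinate descent with the exact unidimensional stepsize performs exactly one Gauss-Seidel sweep for Ax = b.

```python
import numpy as np

def exact_coordinate_sweep(A, b, x):
    """One cycle of coordinate descent on f(x) = 1/2 x'Ax - b'x using the exact
    unidimensional minimizer sigma = (Ax - b)_i / A_ii in each coordinate; the
    resulting update x_i = (b_i - sum_{j!=i} A_ij x_j) / A_ii is one Gauss-Seidel sweep."""
    x = x.copy()
    for i in range(len(b)):
        r_i = A[i] @ x - b[i]       # i-th gradient component (residual)
        x[i] -= r_i / A[i, i]       # exact minimizer along +/- e_i
    return x

# Made-up positive definite example; the iterates approach the solution of Ax = b.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = np.zeros(2)
for _ in range(25):
    x = exact_coordinate_sweep(A, b, x)
print(x, np.linalg.solve(A, b))
```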
 

Input

 
  • You have the choice between 14 predefined functions, for which you can read a short description of the examples here. You may also define a function f of your own, depending on n variables x(1),...,x(n). You also have the choice to specify the initial guess x_0.
  • Either you use the standard parameter settings for the termination criteria, or you specify these yourself. For convenience we have reduced these to ε_x, δ and κ, setting ε_g = ε_x.
  • For the predefined cases you have the possibility to replace the standard initial point by another one. The required dimension can be drawn from the input form.
  • You have the possibility to request a contour plot of f around the best point found, with two of the n variables varying and the other components of x fixed at their current values. You can specify the size of the region. Be careful! Often these functions show rapid growth, such that on a large region you may "see" nothing useful.
 

Output

 
  • CPU time used. If this appears as zero, it means less than 0.01 seconds.
  • the termination reason.
  • The computed best values f(x_opt) and x_opt(1),...,x_opt(n) and the gradient components (each one referring to one of the last n steps).
  • The number of function and gradient component evaluations used.
  • Two plots in semi-logarithmic format showing f_k - f_opt and |∇ f(x_k)_{i(k)}|, the modulus of the gradient component used in step k.
  • The contour plot, if requested.
 


 

21.07.2010