Augmented Lagrangian method of Rockafellar and of Kiwiel

 
  • Here you find code for solving a constrained optimization problem by the method of augmented Lagrangians. For basic facts on optimization problems of this kind (minimize f(x) subject to h(x) = 0, g(x) >= 0),
    look here.
  • The augmented Lagrangian is obtained from the ordinary Lagrangian L(x, μ, λ) by adding a so-called penalty term multiplied by a penalty factor (named ρ here). This serves as a convexifier: it makes the problem satisfy, at least locally, the conditions necessary for applying the max-min technique mentioned on the theory page to a general problem, provided a local solution satisfies the strong second order sufficiency condition (look there). If the original problem is convex, then every ρ > 0 works, but in the general case a ''sufficiently large'' ρ is needed. Since one doesn't know in practice what this means, we try a sequence of geometrically growing values of ρ.
  • For fixed multipliers λ and μ we minimize the augmented Lagrangian φ with respect to x, obtaining an x* that depends on λ and μ. The augmented Lagrangian φ reads, in Rockafellar's and in Kiwiel's variant respectively:
    φ(x, μ, λ; ρ) = f(x) - μ^T h(x) - ||λ||^2/(4ρ) + ρ·( ||h(x)||^2 + ||(g(x) - λ/(2ρ))^(-)||^2 )

    and
    φ(x, μ, λ; ρ) = f(x) - μ^T h(x) + ρ·||h(x)||^2 + (1/(3ρ)) · ∑_{i=1}^{m} t_i

    with
    t_i = max(0, sign(λ_i)·sqrt(|λ_i|) - ρ·g_i(x))^3 - |λ_i|^1.5
    (The superscript (-) for a vector means the componentwise negative part of this vector.)
  • Rockafellar's function is globally only once continuously differentiable, but twice continuously differentiable locally if the solution satisfies the strict complementarity condition, whereas Kiwiel's function is globally twice continuously differentiable.
  • The value of the penalty parameter ρ is computed adaptively, controlled by the two parameters τ0 and γ in connection with the amount of infeasibility; see below.
  • After such a minimization step with respect to x we make a maximization step for the composed function φ(x(μ, λ), μ, λ) with respect to μ and λ. The ''trick'' in our augmented functions is that here λ is not subject to the positivity constraint. This is achieved by turning the inequalities into equalities by subtracting positive slack variables and then eliminating these slack variables with the help of the Kuhn-Tucker conditions. Therefore only (g(x) - λ/(2ρ))^(-), the negative part of this term, appears in the penalty term of Rockafellar's function, and the analogous statement holds for Kiwiel's function. Fortunately the gradients and a good approximation of the Hessian of this complicated looking composed function are extremely cheap: the gradients are the values of the penalty terms multiplied by 2 and the Hessian approximation is (1/ρ) times the unit matrix. These facts result in the simple updates
    μ+ = μ - 2ρ·h(x)
    and
    λ+ = max(0, |λ|^(1/(β-1)) - 2ρ·g(x))^(β-1)
    with β = 2 for Rockafellar and β = 3 for Kiwiel. The max is applied componentwise; observe that the inequality multipliers always stay nonnegative. (Both augmented Lagrangians and these updates are illustrated in the code sketch following this list.)
  • This cycle is repeated until the Kuhn-Tucker conditions of the problem are satisfied to sufficient (user defined) precision.
  • For the unconstrained minimization w.r.t. x we use the BFGS quasi-Newton method.
  • The penalty parameter ρ runs from ρmin up to at most ρmax, but usually it settles at a reasonable value (the ''sufficiently large'' one). We use the following heuristic: ρ is increased by a factor ρfac if the infeasibility of x after the minimization step has increased by more than a factor γ, or if it grows beyond τ0, or if nonconvexity of φ w.r.t. x is detected during the application of the BFGS minimizer (this occurs in the stepsize algorithm). A schematic version of this outer loop is sketched further below.
  • The method converges locally linearly, the rate of convergence improving as ρ increases. A large ρ, however, might lead to failure of the minimizer. For a convex problem convergence is global for any ρ > 0.
  • Termination occurs if
    • the Kuhn-Tucker conditions of the original problem are satisfied to sufficient accuracy.
    • the parameter ρ grows beyond ρmax. This will occur if the problem is infeasible (or the initial guess is so bad that feasibility is not detected).
    • the unconstrained optimizer breaks down (usually in the stepsize selection) because the problem is too ill-conditioned.
    • too many steps are required (we provide a total of 99999 inner iterative steps).
    • the stepsize algorithm fails (if ρ becomes large, then the ''valleys'' of the surface (x, φ(x, μ, λ)) become very steep and narrow, hence the line search might produce a step that is too small).
  • You can play with predefined testcases, see testcases for this, or define a problem of your own.
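For readers who want to experiment, here is a minimal NumPy sketch of the two augmented Lagrangians and of the multiplier updates exactly as written above. The problem interface (f, h, g as callables returning a scalar resp. arrays) and all function names are assumptions made for this illustration; this is not the code actually running behind this page.

  import numpy as np

  def phi_rockafellar(x, mu, lam, rho, f, h, g):
      # f(x) - mu^T h(x) - ||lam||^2/(4 rho) + rho*(||h(x)||^2 + ||(g(x) - lam/(2 rho))^(-)||^2)
      hx, gx = h(x), g(x)
      neg = np.minimum(gx - lam / (2.0 * rho), 0.0)      # componentwise negative part
      return (f(x) - mu @ hx - lam @ lam / (4.0 * rho)
              + rho * (hx @ hx + neg @ neg))

  def phi_kiwiel(x, mu, lam, rho, f, h, g):
      # f(x) - mu^T h(x) + rho*||h(x)||^2 + (1/(3 rho)) * sum_i t_i
      hx, gx = h(x), g(x)
      t = (np.maximum(0.0, np.sign(lam) * np.sqrt(np.abs(lam)) - rho * gx) ** 3
           - np.abs(lam) ** 1.5)
      return f(x) - mu @ hx + rho * (hx @ hx) + t.sum() / (3.0 * rho)

  def update_multipliers(x, mu, lam, rho, h, g, beta):
      # multiplier step; beta = 2 for Rockafellar, beta = 3 for Kiwiel
      mu_new = mu - 2.0 * rho * h(x)
      lam_new = np.maximum(0.0, np.abs(lam) ** (1.0 / (beta - 1))
                           - 2.0 * rho * g(x)) ** (beta - 1)
      return mu_new, lam_new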
 

There are two input forms: a predefined and a user defined one.
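Before the two forms are described, the following schematic sketch shows how the algorithmic parameters ρmin, ρmax, ρfac, γ and τ0 appearing in both forms might be used in the outer iteration. This is an illustration only, under the same assumed interface as the sketch above: SciPy's general-purpose BFGS stands in for the inner minimizer (the code behind this page uses its own BFGS with a special stepsize algorithm), and the convergence test shown is a crude substitute for the full Kuhn-Tucker check.

  import numpy as np
  from scipy.optimize import minimize

  def infeasibility(x, h, g):
      # psi = ||h(x)||^2 + ||g(x)^(-)||^2, the measure of infeasibility used on this page
      hx, gneg = h(x), np.minimum(g(x), 0.0)
      return hx @ hx + gneg @ gneg

  def solve(f, h, g, x0, phi, update, beta, rho_min=1.0, rho_max=1e8,
            rho_fac=10.0, gamma=2.0, tau0=1e3, tol=1e-8, max_outer=100):
      x, rho = np.asarray(x0, float), rho_min
      mu = np.zeros(np.atleast_1d(h(x)).size)
      lam = np.zeros(np.atleast_1d(g(x)).size)
      psi_old = infeasibility(x, h, g)
      for _ in range(max_outer):
          # inner minimization of phi w.r.t. x, multipliers and rho fixed
          x = minimize(lambda z: phi(z, mu, lam, rho, f, h, g), x, method="BFGS").x
          psi = infeasibility(x, h, g)
          if psi > tau0:                      # far from feasible: reset x and increase rho
              x, rho = np.asarray(x0, float), rho * rho_fac
          elif psi > gamma * psi_old:         # infeasibility grew too much: increase rho
              rho *= rho_fac
          else:                               # outer (multiplier) step
              mu, lam = update(x, mu, lam, rho, h, g, beta)
              psi_old = psi
              if psi <= tol:                  # crude stand-in for the Kuhn-Tucker test
                  return x, mu, lam, rho
          if rho > rho_max:
              raise RuntimeError("rho exceeded rho_max; the problem may be infeasible")
      return x, mu, lam, rho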

Input - predefined variant

 
  • Select a problem number from the testcases. The problem number always looks like hs followed by a number, sometimes followed by scal (= scaled). You must write the problem number exactly as seen there. The scaled versions allow you to study the effects of bad or good scaling of a problem, which in real life applications is of utmost importance.
  • You have the choice between Rockafellar's and Kiwiel's function. Clearly Kiwiel's is theoretically ''cleaner'' since it fulfills the C2 property necessary for justifying BFGS globally, but is it practically better?
  • You have the choice to see the intermediate results displayed or only the final result.
  • There is a standard initialization for the algorithmic parameters, but you can use your own choice:
    1. epsx: The inner minimization w.r.t. x is finished if
      ||∇_x φ|| <= epsx*(1 + epsx*|φ|)
      or
      ||x - x_previous|| <= epsx*(1 + ||x||)
      (a small sketch of this test follows after this list).
      Important: epsx > 0, useful in [1.0e-14, 0.01]!
    2. ρmin, ρmax: The first and the highest allowed value for the penalty parameter ρ.
    3. ρfac: the factor by which ρ gets increased, if this is found necessary. Of course ρfac > 1 !
    4. γ: Monotonic decrease of the infeasibility is not guaranteed with this method. In the inner iteration we allow an increase of ψ by at most the factor γ; otherwise ρ will be increased. Important: γ > 1!
    5. τ0: the upper bound allowed for the infeasibility. This is
      ψ = ||h(x)||^2 + ||(g(x))^(-)||^2 .
      We use this because far away from the feasible set the problem might behave quite irregularly. If you don't know a useful value here, you might use a large one. If ψ becomes larger than τ0, we reset x to its initial value, increase ρ and restart.
    6. You may use the standard initial guess or provide your own. You must of course provide all n components; look at the problem description for this!
  • You might require a contour plot of the augmented Lagrangian w.r.t. two components of x, with the other components and the multipliers held at their optimal values. In this case you specify these two components and the width of the enclosing box you want to see. You might also require that bounds on the variables not be violated in the plot.
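As an illustration of item 1 above, the inner stopping test could be coded as follows; this is a sketch only, and the variable names are assumptions.

  import numpy as np

  def inner_converged(x, x_prev, phi_val, grad_phi, epsx):
      # stop the inner BFGS iteration if the gradient of phi w.r.t. x or the step is small
      small_grad = np.linalg.norm(grad_phi) <= epsx * (1.0 + epsx * abs(phi_val))
      small_step = np.linalg.norm(x - x_prev) <= epsx * (1.0 + np.linalg.norm(x))
      return small_grad or small_step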
 

Input - user defined variant

 
  • You first specify the dimensions of your problem: n, the number of variables,
  • nh, the number of equality constraints (can be zero), and ng, the number of inequality constraints (can also be zero, but nh + ng > 0!).
  • Then you specify the objective function fx and the functions describing the equality constraints hxi and the inequality constraints gxi as described here (a tiny example problem is sketched after this list).
  • You choose between Rockafellar and Kiwiel.
  • You specify your initial guess: of course you must type n numbers here.
  • Next you specify the parameters to be chosen:
    • τ0: Upper bound for the infeasibility of x. Should your initial x deliver an infeasibility worse than this, τ0 is increased and you will be told of this. Infeasibility here means ψ = ||h(x)||^2 + ||(g(x))^(-)||^2 .
    • You choose the amount of displayed output.
    • There is a default setting of some parameters, but you might change them:
      1. epsx: The inner minimization is terminated if
        ||∇_x φ|| <= epsx*(1 + epsx*|φ|)
        or
        ||x - x_previous|| <= epsx*(1 + ||x||).
        Important: epsx > 0, useful in [1.0e-14, 0.01]!
      2. ρmin, ρmax: The first and the highest allowed value for the penalty parameter ρ.
      3. ρfac: the factor by which ρ gets increased, if this is found necessary. Of course ρfac > 1 !
      4. γ: Monotonic decrease of the infeasibility is not guaranteed with this method. In the inner iteration we allow an increase of ψ by at most the factor γ; otherwise ρ will be increased. Important: γ > 1!
    • You might require a contour plot of the augmented Lagrangian w.r.t. two components of x, with the other components and the multipliers held at their optimal values. In this case you specify these two components and the width of the enclosing rectangle you want to see. You might also require that bounds on the variables not be violated in the plot.
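As a concrete illustration of the ingredients this form asks for, a tiny example problem with n = 2, nh = 1 and ng = 1 could be written down as follows in the notation of the sketches above. This is an invented example, not one of the predefined test cases.

  import numpy as np

  def f(x):                        # objective fx
      return (x[0] - 1.0) ** 2 + (x[1] - 2.0) ** 2

  def h(x):                        # equality constraints hx_i = 0
      return np.array([x[0] + x[1] - 2.0])

  def g(x):                        # inequality constraints gx_i >= 0
      return np.array([x[0]])

  x0 = np.array([0.5, 0.5])        # initial guess: all n components must be given

  # with the sketches above one would then call, e.g.,
  # x, mu, lam, rho = solve(f, h, g, x0, phi_rockafellar, update_multipliers, beta=2)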
 

Output

 
  • A printout of the solution x, μ, λ, of the gradient of the augmented Lagrangian at this solution, and a statistic of the steps: number of steps, number of function evaluations and type of BFGS updates. (mod steps are those where Powell's modification of the y vector was used in order to limit the growth of the norm of the quasi-Newton matrix; a common form of this modification is sketched after this list.)
  • If requested you also get a printout of the intermediate values. For the inner minimization (multipliers and penalty parameter fixed) these are
    the step number, φ, ||∇_x φ|| and ψ,
    the measure of infeasibility. For the outer iteration they are the new values of the multipliers and, of course, any change in ρ.
  • A plot showing the decrease of φ, displayed as φ - φmin.
  • Two plots showing lg(||∇φ||) and lg(ψ).
  • The jumps in the plots are at the steps where ρ changes.
  • If requested, the contour plot of the augmented Lagrangian at the solution.
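The ''mod steps'' mentioned above refer to Powell's modification of the difference vector y in the BFGS update. A common form of this damping (going back to Powell, 1978) is sketched below; the threshold 0.2 and the other details are the textbook choice and may differ from what the code behind this page actually uses.

  import numpy as np

  def damped_bfgs_update(B, s, y, threshold=0.2):
      # update the Hessian approximation B from the step s and the gradient difference y;
      # if s^T y is too small, y is modified (a "mod step") to keep B positive definite
      # and to limit the growth of its norm
      Bs = B @ s
      sBs = s @ Bs
      sy = s @ y
      if sy < threshold * sBs:
          theta = (1.0 - threshold) * sBs / (sBs - sy)
          y = theta * y + (1.0 - theta) * Bs
          sy = s @ y
      return B - np.outer(Bs, Bs) / sBs + np.outer(y, y) / sy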
 


29.01.2016