SQP method

 

Directly to the input form: predefined cases

 

Directly to the input form: userdefined case

 
  • Here you find a code for solving constrained optimization problems by the SQP method, with a preference for equality constrained subproblems. This is P. Spellucci's ''donlp2''.
  • For basic facts on optimization problems of this kind: f(x) = min, h(x) = 0, g(x) >= 0,
    look here. The dimensions of x, h, g are n, p, m respectively.
  • The theoretical basis of the method is the following fact: if in some neighborhood of the feasible region every point satisfies the extended Mangasarian--Fromowitz constraint qualification
    MFCQext : for all x there holds:
    ∇h(x) has full rank and there exists a vector z such that
    ∇h(x)^T z = 0 and ∇g_A(x)^T z > 0
    with A = { i : g_i(x) <= 0 },
    then the penalty function
    φ(x; γ, β) = f(x) + ∑_{i=1,..,p} γ_i |h_i(x)| + ∑_{i=1,..,m} β_i max{0, -g_i(x)}
    has no local minima other than the constrained local minima of the original problem, provided the weight factors γ and β are large enough. It can be shown that local estimates of the Kuhn--Tucker multipliers, increased by a fixed term, can be used as such weights. Hence unconstrained minimization of φ might serve to solve the original constrained problem exactly.
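As an illustration, the penalty function φ above can be written down directly. The tiny problem f, h, g below is an assumption chosen for demonstration only, not one of the predefined test cases:

```python
import numpy as np

# Hypothetical toy problem for illustration only:
# f(x) = x0^2 + x1^2,  h(x) = x0 + x1 - 1  (p = 1),  g(x) = x0 >= 0  (m = 1).
def f(x):
    return x[0]**2 + x[1]**2

def h(x):
    return np.array([x[0] + x[1] - 1.0])

def g(x):
    return np.array([x[0]])

def phi(x, gamma, beta):
    """Exact l1 penalty: f(x) + sum_i gamma_i*|h_i(x)| + sum_i beta_i*max(0, -g_i(x))."""
    return (f(x)
            + np.dot(gamma, np.abs(h(x)))
            + np.dot(beta, np.maximum(0.0, -g(x))))
```

At a feasible point both penalty sums vanish and φ coincides with f; at an infeasible point the weighted constraint violations are added.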
  • However, φ is not smooth, hence the usual methods of unconstrained minimization do not apply. It can be shown, though, that the solution of a generalized linearization of the problem can be used as a local direction of descent for φ, and this leads to the most successful solution method for constrained optimization, the SQP (sequential quadratic programming) method. Quite a lot of variants of this method have been developed in the last two decades. The first approach was the full SQP approach: the quadratic program
    f(x_k) + ∇f(x_k)^T d + (1/2) d^T H_k d = min
    subject to
    h(x_k) + ∇h(x_k)^T d = 0
    g(x_k) + ∇g(x_k)^T d >= 0

    is solved for d, and the next approximation is sought along the ray x_k + σ d with a line search for φ.
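A minimal sketch of such a line search on φ follows; this is a crude backtracking rule with a simplified decrease test, chosen here for illustration, while donlp2's actual stepsize selection is considerably more refined:

```python
import numpy as np

def backtracking_for_phi(phi, x, d, sigma0=1.0, shrink=0.5, c=1e-4, max_tries=30):
    """Backtrack sigma along the ray x + sigma*d until the merit function phi
    decreases sufficiently. Since phi is nonsmooth, this simplified test asks
    for a decrease proportional to sigma rather than using an Armijo condition
    with a directional derivative."""
    phi0 = phi(x)
    sigma = sigma0
    for _ in range(max_tries):
        if phi(x + sigma * d) <= phi0 - c * sigma:
            return sigma
        sigma *= shrink
    return sigma
```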
  • The matrix H_k in this model is meant as an approximation of the Hessian of the Lagrangian and is chosen positive definite (otherwise a trust region constraint is put on d). In this code we use a BFGS update of the augmented Lagrangian with a modification given by Pantoja and Mayne.
  • Unfortunately, the linearized constraints in the QP problem above might be incompatible, which requires further modifications: sometimes the values of the violated constraints are multiplied by a damping factor, sometimes additional slack variables are introduced, which then also enter the objective function with the aim of driving them to zero.
  • The solution of a full QP is quite costly, and far away from the solution it is questionable whether the quality of the descent direction d pays for this effort. Hence the method used here first tries to replace it by the equality constrained problem
    f(x_k) + ∇f(x_k)^T d + (1/2) d^T H_k d = min
    subject to
    h(x_k) + ∇h(x_k)^T d = 0
    g_A(x_k) + ∇g_A(x_k)^T d = 0

    which can be solved as a single linear system of equations. Here A is an estimate of the ''active set'', computed as
    A = A(δ) = { i : g_i(x_k) <= δ }
    where δ is a measure of the distance of x from the nearest Kuhn--Tucker point:
    δ = ||∇_x L(x, μ, λ)|| + ||h(x)|| + ||g^-(x)|| + ||λ^-||
    with the least squares multipliers μ and λ, which in turn are computed using A(0) as the first guess. Only if the matrix of the gradients of the active constraints is too ill-conditioned (or even rank deficient) does the code switch to the full SQP step and regularize the problem by individual slack variables for all constraints in the current active set (only). Let z_h and z_g be these slack variables; then a term
    τ_QP (||z_h||_1 + ||z_g||_1) + α (||z_h||_2 + ||z_g||_2)
    is added to the objective function. It can be shown that under the extended MFCQ condition the slack variables become zero near a solution if τ_QP is chosen large enough. The code tries an increasing sequence of values for τ_QP and ''gives up'' if a predefined upper bound is exceeded.
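To make the ''single linear system'' concrete: the equality constrained subproblem is solved via the KKT system of the QP. Below is a minimal NumPy sketch (dense linear algebra, without the conditioning safeguards the code itself applies), where A stacks the gradients of h and of the estimated active inequalities, and c stacks their current values:

```python
import numpy as np

def eq_qp_step(H, grad_f, A, c):
    """Solve  min  grad_f^T d + 0.5 d^T H d   s.t.  c + A d = 0
    via the KKT system  [H  A^T; A  0] [d; lam] = [-grad_f; -c].
    H: (n,n) positive definite model Hessian, A: (m,n) constraint
    Jacobian, c: (m,) current constraint values."""
    n, m = H.shape[0], A.shape[0]
    K = np.block([[H, A.T],
                  [A, np.zeros((m, m))]])
    rhs = np.concatenate([-grad_f, -c])
    sol = np.linalg.solve(K, rhs)
    return sol[:n], sol[n:]   # step d and multiplier estimates
```

One solve of the KKT system yields both the step d and multiplier estimates for the active constraints.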
  • The method is enhanced by two more features: for the bound constraints the technique of projection is used (the ray x_k + σ d is projected onto violated bounds), and near a solution a curved arc is used instead of the linear ray in order to allow the stepsize 1 locally (overcoming the ''Maratos effect''). This results in Q-superlinear convergence if the approximated solution satisfies the strong second order sufficiency condition.
  • There are a number of algorithmic parameters for which default values are given, but which are at your disposal (we use the notation of the input form here):
    1. tau0: the upper bound allowed for the infeasibility
      ψ = ||h(x)||_1 + ||g^-(x)||_1 .
      We use this because far away from the feasible set the problem might behave quite irregularly. If you cannot say anything useful here, you might use a large value. If ψ becomes larger than tau0, the method tries pure ''restoration steps'' (treating f as zero) in order to reduce the infeasibility.
    2. epsx: parameter expressing ''sufficient precision''.
    3. del0: the upper bound for δ
    4. delmin: a lower bound for δ (necessary to cope with roundoff effects; a point with ψ <= delmin is considered ''sufficiently feasible'').
    5. rho: if the condition number of the matrix ( ∇h , ∇g_A )(x) is greater than 1/rho, then we switch to the full regularized SQP step.
    6. rho1: if the condition number of the quasi-Newton update H_k is greater than 1/rho1, we restart it with a scaled multiple of the unit matrix.
    7. smallw: an inequality multiplier is considered ''wrong'', and hence its constraint a candidate for leaving the active set, if λ_i <= -smallw .
  • The iteration terminates successfully if the following conditions are met:
    • ||∇_x L|| <= epsx (1 + ||∇f(x)||) ''stationarity''
    • ψ(x) <= delmin ''primal feasibility''
    • λ_i >= -smallw ''dual feasibility''
    • |λ|^T |g(x)| <= delmin*smallw*(nh+ng) ''complementarity''
    There are of course more exit conditions; we show the complete list with some explanations here:
    1. ''QPSOLVER: WORKING SET SINGULAR IN DUAL EXTENDED QP '', a severe roundoff effect in the QP solver, theoretically this is not possible.
    2. ''QPSOLVER: EXTENDED QP-PROBLEM SEEMINGLY INFEASIBLE '', same reason as the foregoing.
    3. ''QPSOLVER: NO DESCENT DIRECTION FROM QP FOR TAU=TAU_MAX'', the extended Mangasarian--Fromowitz condition seems to be violated by the current problem or at the current x.
    4. ''QPSOLVER: ON EXIT CORRECTION SMALL, INFEASIBLE POINT'', same reason as the foregoing.
    5. ''STEPSIZESELECTION: COMPUTED D NOT A DIRECTION OF DESCENT'': in some situations we try the direction d coming from the QP solver although it has already signaled trouble. This then cancels the run.
    6. ''MORE THAN MAXIT ITERATION STEPS'': too much work required to reach the required precision; possibly a very ill-conditioned or badly scaled problem with many restarts.
    7. ''STEPSIZESELECTION: NO ACCEPTABLE STEPSIZE IN [SIGSM,SIGLA]'': typically the effect of wrong gradients (which cannot occur here), severe ill-conditioning, or too large penalty weights (a too small rho?)
    8. ''STEPSIZESELECTION: DIRECTIONAL DERIV. VERY SMALL, INFEASIBLE'': a local infeasible minimizer of φ has been found, at least locally the problem is infeasible.
    9. ''KT-CONDITIONS SATISFIED, NO FURTHER CORRECTION COMPUTED'' : the successful exit, see above.
    10. ''KT-CONDITIONS SATISFIED, COMPUTED CORRECTION SMALL'': essentially the same as the foregoing.
    11. ''STEPSIZESELECTION: X ALMOST FEASIBLE, DIR. DERIV. VERY SMALL'': a very flat φ, continuing the iteration seems meaningless, maybe your precision requirements were too strong.
    12. ''KT-CONDITIONS (RELAXED) SATISFIED, SINGULAR POINT'': if the full SQP step is taken we relax automatically the parameter epsx by a factor 100, the other termination conditions being unchanged.
    13. ''VERY SLOW PRIMAL PROGRESS, SINGULAR OR ILLCONDITONED PROBLEM'': x is sufficiently feasible and the complementarity condition sufficiently satisfied, but the objective function stagnates.
    14. ''VERY SLOW PROGRESS IN X, SINGULAR PROBLEM'': no significant changes in x for at least 4 steps during the full SQP.
    15. ''CORRECTION VERY SMALL, ALMOST FEASIBLE BUT SINGULAR POINT'': ||d|| < epsx*0.01*min(1,||x||) during the full SQP step.
    16. ''MAX(N,10) SMALL DIFFERENCES IN PENALTY FUNCTION,TERMINATE'': φ doesn't change significantly any more.
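The four success tests above can be collected into one check. This is a sketch only, with parameter names taken from the list in the theory section; the function signature itself is an assumption for illustration:

```python
import numpy as np

def kt_satisfied(grad_L, grad_f, psi, lam, g, nh,
                 epsx=1e-8, delmin=1e-10, smallw=1e-8):
    """Evaluate the four successful-termination tests: stationarity,
    primal feasibility (psi), dual feasibility and complementarity.
    lam and g hold the inequality multipliers and constraint values;
    nh is the number of equality constraints (for the nh+ng factor)."""
    ng = len(g)
    stationary = np.linalg.norm(grad_L) <= epsx * (1.0 + np.linalg.norm(grad_f))
    primal = psi <= delmin
    dual = np.all(lam >= -smallw)
    complementary = np.abs(lam) @ np.abs(g) <= delmin * smallw * (nh + ng)
    return bool(stationary and primal and dual and complementary)
```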
  • You can play with predefined test cases (see testcases for this) or define a problem of your own.
 

There are two different input forms: Predefined and
Userdefined

Input - predefined case

 
  • Select a problem number from testcases. The problem number always looks like hs followed by a number, sometimes followed by scal (= scaled). You must write the problem number exactly as seen there. The scaled versions allow you to study the effects of bad or good scaling of a problem, which in real-life applications is of utmost importance.
  • You have the choice to see the intermediate results displayed or only the final result.
  • There is a standard initialization for the algorithmic parameters, but you can use your own choice; for the meaning see above. On the input form you find more information about this.
  • You may use the standard initial guess or provide your own one. You must provide all n components of course; look at the problem description for this!
  • You might require a contour plot of φ w.r.t. two components of x, with the others and the multipliers held at their optimal values. In this case you specify these two components and the width of the enclosing rectangle you want to see. You might also require that bounds on the variables not be violated in the plot.
 

Input - userdefined case

 
  • You first specify the dimensions of your problem: n, the number of variables; nh=p, the number of equality constraints (can be zero); and ng=m, the number of inequality constraints (can also be zero, but nh+ng > 0!).
  • Then you specify the objective function fx and the functions describing the equality constraints hxi and the inequality constraints gxi as described here.
  • You specify your initial guess: of course you must type n numbers here.
  • You choose the amount of displayed output.
  • You have the possibility to set your own choice of the algorithmic parameters, for this see the list above in the theory section.
  • You might require a contour plot of φ w.r.t. two components of x, with the others and the weights held at their optimal values. In this case you specify these two components and the width of the enclosing rectangle you want to see. You might also require that bounds on the variables not be violated in the plot.
 

Output

 
  • You get displayed text with the termination reason, the final values of the termination criteria, the number of function and gradient evaluations, the solution and a list of the constraint values and the associated multipliers.
  • If required, you get a condensed table of the intermediate values: FX = objective function value (possibly a couple of zeros in the beginning, if the initial infeasibility was worse than tau0), B2N = ||H_k^{-1/2} ∇_x L(..)||, UPSI = ψ, UMI = min{0, λ_i, i=1,...,m}, NR = number of elements in A, SI = 1 if this is a full SQP step, -1 otherwise.
  • a plot showing B2N and UPSI in a semilogarithmic scale,
  • a plot showing ||λ(-)||, the dual infeasibility and |λ|T|g(x)|, the complementarity.
  • If required a plot of φ
 


20.01.2016