|
- In many applications one wants to solve a minimization (or maximization)
problem, where explicit computation of derivatives of the objective
function is impossible, although existence of such derivatives
is theoretically guaranteed. In such situations the methods of
"derivative free optimization", DFO for short, are useful.
Using finite difference approximations instead of analytical derivatives
is often problematic, especially if the function values are not given
to very high precision, as is often the case in computations involving
simulations of continuous processes. If the existence of second derivatives
is known then second order approximations are justified.
- Here we use a new method of M.J.D. Powell, namely his code NEWUOA,
which uses a quadratic model for the objective function, whose
minimizer under a trust region constraint is used as the next
approximation of the minimizer.
This quadratic model is obtained by interpolation of function data on a
grid in general position which guarantees unique solvability of
the corresponding linear system.
For the case n=2 you may imagine a triangle and as interpolation points
the three vertices and the midpoints of the edges as interpolation points.
By a linear transformation to the standard triangle with the vertices
(0,0), (1,0), (0,1) one obtains a simple linear system fixing
the six parameters of a
quadratic interpolation in
two variables, which allows retranslation to the general position by invertibility
of the linear transformation.
In general the model has the form
fappr(x) = f(x0)+gT*(x-x0)
+(1/2)(x-x0)T*H*(x-x0) .
g is an approximation for the gradient and H for the Hessian
of f at x0 .
x0 is the best point found so far.
This model is then minimized on a ball of radius Δ, this radius
being computed adaptively itself. The user gives an initial value and a final value for it.
This is the so called "trust region radius".
The minimizer is a candidate for the next best approximation for f's minimizer.
It becomes accepted if the new function value is sufficiently smaller than
f(x0), compared with the decrease obtained in the quadratic model.
If this is not obtained (for example because the function behaves far from the
quadratic model), then the trust region radius is decreased and the minimization
of the model repeated, and with very good accordance the radius may also be increased.
The new point replaces one of the old interpolation points, allowing updating
techniques in the linear system solves.
If the interpolation points get more and more widespread (specifically if their distance
becomes larger than 2 Δ), then points "far away" are replaced by ones nearer
to the newest point but in such a way that invertibility of the linear systems involved
in the interpolation model is guaranteed.
- A complete quadratic model needs (n+1)(n+2)/2
(x,f(x)) pairs and this becomes unsupportable expensive for larger n.
Hence it is possible to use a smaller number of new interpolation points only around
the next best approximation point, namely in the
range [2n+1,(n+1)(n+2)/2] which is of course not sufficient to fix the Hessian
completely if chosen less than the upper bound.
The computation of H+, the next value for it,
is then accomplished using the principle of minimal
change (measured in the Frobeniusnorm) compared to the previous value of H,
i.e. one solves
||H-H+|| = min w.r.t. H+
subject to a symmetry condition and the interpolation conditions for the model
(i.e. linear equality constraints).
This results in a linear system of equations which is solved using an explicit inverse which is
updated from step to step.
The result is a Hessian approximation similar to the so called
''PSB-Update'', a special quasi Newton method in the field of differentiable minimization
- Termination of the computation occurs
if the trust region radius becomes smaller than the input parameter rhomin,
which should be a measure of the precision in x desired by the user, if
the maximum allowed number of function evaluations (set by the user) has been exceeded prior of
reaching this or if, exceptionally, the trust region computation did not produce a decrease in the
quadratic model, which can occur if excessive roundoff due to illconditioning is present.
- A detailed description of the method appeared in
M.J.D. Powell: "The NEWUOA software for unconstrained optimization without derivatives", in
"Large-Scale Nonlinear Optimization" eds. G. Di Pillo and M. Roma, Springer 2006, pages 255-297.
|
|
- You have the choice between 14 predefined functions for which you can read a short
examples description here.
You also may define a function f of your own,
depending on n variables x(1),...,x(n).
You also have the choice to specify the initial guess x0.
- You may use the default values for the some algorithmic parameters or
specify your own choice. These parameters are:
rhobeg, the first value of the trust region radius (not too small!). Powell
recommends 10 percent of the maximum expected change in a component of x for this.
rhomin, the smallest trust region radius allowed, stands for the desired
precision in the solution x,
maxfun, recommendation for this: 30*n*npkt at least,
npkt, the number of new evaluation points to be chosen after each successful
trust region step, and the amount of intermediate output, iprint.
- For the predefined cases you have the possibility to replace the standard initial point by
another one. The required dimension can be drawn from the input form.
- You have the possibility to require a contourplot of f around the best point found
with two of the n variables varying and the other components of x fixed at their
current values. you can specify the size of the region. Be careful! Often these functions show
rapid growth such that on a large region you may ''see'' nothing useful.
|
|
- The protocol of the intermediate results, if required.
- The total cpu time. Zero means "less than 0.01 seconds".
- The termination reason.
- The best values f(xopt) and
x(1)opt,...,x(n)opt.
- The number of function calls.
- Four plots: f - fopt,
and the euclidean length of g (the approximate gradient in the quadratic model),
both in semi logarithmic scale,
the current number of function evaluations and the actual trust region radius, anything over the
step number.
- The contour plot, if required.
|