The Levenberg-Marquardt method: nonlinear equations
	Directly to the input form
	The Levenberg-Marquardt method solves nonlinear systems of equations F(x) = 0 with F: \|Rⁿ -> \|Rⁿ using a so called ''trust region'' concept in minimizing S(x) = \|\|F(x)\|\|₂² . Given an initial guess x₀ the process is iterative: x_k+1 = x_k+d_k. Here d_k is the solution of an inequality constrained linear least squares problem: \|\|F(x_k) + J_F(x_k) d \|\|² = min_d subject to \|\|D_kd\|\| <= δ_k . () D_k* is a diagonal matrix which serves as a weighting of the corrections. In principle the norm appearing in the constraint could be different from the euclidean one, but here we also use the euclidean norm. D_i,i is chosen internally as the maximum of one and the length of the i-th column of J_F. The linear least squares problem is solved using the QR decomposition of J_F. J_F is the Jacobian of F and in this application we compute it analytically using JAKEF from your function input. The solution of the constrained problem can in this case be completely characterized by the following theorem of Sorensen (SINUM 19, (1982), 409-426) (we omit the index k in the following) d is a solution of () if and only if the following two properties hold λ >= 0* and \|\|d\|\| <= δ and λ(δ-\|\|d\|\|) = 0 (J_F^TJ_F(x)+λ D²)d = - J_F^TF(x) () This means that either the unconstrained least squares solution solves the problem or a parameter λ must be found which solves () with the norm of d equal to δ. Hence one solves firstly the unconstrained least squares problem. If the solution does violate the norm constraint then one computes λ such that () holds. This equation is equivalent to the linear least squares problem obtained from the original one by appending sqrt(λ)D to J_F and a vector of n zeroes to F and gives d as a function of λ. So δ-\|\|d(λ)\|\| = 0 becomes a scalar nonlinear equation in λ which can be solved rather efficiently by updating techniques for the QR decomposition. λ is known as the ''regularization parameter''. This allows a rank deficient J_F in the problem. The so called ''trust region radius'' δ_k is computed adaptively depending on the difference between the true decrease \|\|F(x_k)\|\|²-\|\|F(x_k+1)\|\|² and the expected decrease, obtained from the linearized model \|\|F(x_k)\|\|²-\|\|F(x_k) + J_{F(x_k)}d_k \|\|² The variable ρ in graphical output (plot number 4) is the quotient of these two. With ρ near one (here: >= 0.75) or with ρ between 0.25 and 0.75 and the regularization parameter λ =0 (plot number 3) , δ_k becomes increased, with ρ between 0.25 and 0.75 and λ > 0 or 10^-4 <= ρ < 0.25 it stays fixed and with ρ < 10^-4 it becomes diminished and the step repeated. In theory,without rounding effects, this ends always successfully as long as x is not a stationary point of S. This is for a solvable system with regular Jacobian a quadratically convergent method (up to the steplength computation identical to the classical Newton's method) as long as λ stays zero. If λ stays nonzero all over the iteration, then the convergence can be very slow. The termination of the iteration is controlled by three parameters: xtol, ftol and gtol. These are bounds for the relative change in x, \|\|F\|\| and the so called ''scaled gradient'', this is the maximal angle between the vector F and the columns of its Jacobian J_F. If one of these is reached, termination occurs. There is an ''emergency exit'', if a maximum number of function evaluations is reached. Defaults are 10^-7, 10^-10, 10^-7, 10000** respectively. The numerical code used here is lmder of Jorge J. More (from netlib/minpack).

	Input
	You have the possibility to play with 6 prepared examples, of which a description is to be found here or you define a problem of your own. If this latter case You specify the number of parameters n (the dimension of x). Important: 1 <= n <= 100 ! The function F as a piece of FORTRAN code. This must depend on the x(k), k=1,...,n . You always specify initial values for x(k), k=1,...,n. You can choose your own values for xtol, gtol , ftol and the maximum number of function evaluations if you want so or use the defaults. You may specify the print frequency of selected variables of x, this is the variable nprint and the computation of the initial trust region radius δ₀ which is either *factor\|\|D₀x₀\|\| if this is nonzero or factor** itself.

	Output
	You get several plots: The first one shows the log10 of the difference \|\|F(x_k)\|\|-\|\|F_min\|\|. Here -99 means "zero". The second shows the gradient norm (red) and δ_k (green) , both in the decadic logarithm. Here ''-20'' means ''zero''. The third one shows λ and the fourth ρ (red), again in a logarithmic scale, with ''-20'' representing zero resp. -infinity. You get a text output showing your parameter choice, the termination reason and the solution.

	To the input form

Back to the top!

01.08.2013