The Gauss-Newton method

Directly to the input form: exponential fit

Directly to the input form: user defined case

 
  • The Gauss-Newton method solves nonlinear least squares problems. In abstract terms this means minimizing the Euclidean length of a vector function F with m components with respect to the variable vector x with n components, where we assume here that m >= n. In applications the situation is similar to many applications of linear least squares: for a set of data (ti,yi) one seeks the parameter vector x of a function model f(x;t) such that the sum of squares of the deviations f(x;ti) - yi is minimal. The components of F are made up from these deviations (or ''residuals''). Hence
    S(x) = (1/2) * ||F(x)||_2^2
    and
    Fi(x) = f(x;ti) - yi.
    The function f(x;t) depends nonlinearly on x.
  • This method tries to find a zero of the gradient of S(x), which of course is only a necessary optimality condition. Since this is achieved by a monotonic decrease of S, with a decrease depending on the norm of the gradient, it is hoped that indeed a minimizer is found. Newton's method itself could also be used for finding a zero of ∇S(x). This would require the computation of the Hessian matrix of S (the matrix of second partial derivatives), which reads, using the Jacobian JF(x) of the vector function F,
    ∇2 S(x) = JF(x)^T JF(x) + Σ_{i=1}^{m} Fi(x) * ∇2 Fi(x).
    Far from the minimizer this could be indefinite and fail to produce a direction of descent for S. This, besides the necessity of computing the second partial derivatives of F, is the reason why the Gauss-Newton direction is often preferred.
  • Here only the always positive semidefinite part of the Hessian is used. Indeed we make the stronger assumption that the Jacobian JF is always of full rank, which implies that JF(x)^T JF(x) is always positive definite. The direction of descent for S is defined as
    d(x) = -(JF(x)^T JF(x))^(-1) JF(x)^T F(x)
  • d(x) is not computed from this formula but rather by solving the linear least squares problem
    ||F(x) + JF(x)*d||_2^2 = min_d .
    This is done using numerically stable techniques; here we use the Householder QR decomposition of JF.
  • In order to guarantee convergence, at least to a zero of the gradient, it is in addition necessary to decrease S sufficiently along this direction. This requires the computation of a step length τ. The next iterate is then obtained as
    x_new = x_old + τ·d.
  • This step size is computed by a combination of interpolation of S(x+τ·d) and reduction of the interpolation interval (which always has the left endpoint 0), where the data S(x) and (d/dτ)S(x+τ·d) at τ=0 are used. One can show that this always enforces convergence, provided the full rank assumption for JF is satisfied on the initial level set.
  • The order of convergence of this process is linear (only in the case F(x*) = 0, the perfect fit (interpolation), is it quadratic). The rate depends on the magnitude of the optimal residual and is of the form C*||F(x*)|| < 1, where the constant C also depends on the condition of JF.
  • The numerical code used here is NLSCON from the Elib of ZIB-Berlin.
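The scheme described in the bullets above (QR-based computation of the Gauss-Newton direction, followed by a step-length reduction that enforces monotone descent of S) can be sketched in a few lines. This is a minimal Python/NumPy illustration, not the NLSCON code: simple step halving stands in for NLSCON's interpolation-based step-length strategy, and the names gauss_newton, maxstep and sigmin are chosen here for illustration only.

```python
import numpy as np

def gauss_newton(F, J, x0, rtol=1e-10, maxstep=50, sigmin=1e-12):
    """Damped Gauss-Newton sketch: F is the residual vector function,
    J its Jacobian, x0 the initial guess."""
    x = np.asarray(x0, dtype=float)
    for _ in range(maxstep):
        Fx = F(x)
        Q, R = np.linalg.qr(J(x))            # Householder QR of the Jacobian
        d = np.linalg.solve(R, -(Q.T @ Fx))  # d minimizes ||F(x) + J(x)*d||_2
        S = 0.5 * (Fx @ Fx)
        tau = 1.0                            # step length
        while tau >= sigmin:                 # halve until S decreases
            Fn = F(x + tau * d)
            if 0.5 * (Fn @ Fn) < S:
                break
            tau *= 0.5
        x = x + tau * d
        if np.linalg.norm(tau * d) <= rtol * (1.0 + np.linalg.norm(x)):
            break
    return x
```

For a perfect-fit problem (F(x*) = 0) this iteration converges quadratically near the solution, in agreement with the convergence statement above; for large optimal residuals only linear convergence can be expected.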
 

There are two input forms: one with a predefined exponential fitting problem and one where the user can define the fit himself.

 

Input: the predefined exponential fit

 
  • Here the data model f(x;t) is a sum of exponentials such that
    Fi(x) = a0 + a1*exp(b1*ti) + ... + an*exp(bn*ti) - yi  with  x = (a0,a1,b1,...,an,bn)^T.
  • You specify the number of exponentials n.
  • You specify the 2n+1 ''true'' coefficients x = (a0,a1,b1,...,an,bn)^T.
  • You specify an interval [a,b] for t, in which the data will be generated equidistantly.
  • You specify the number m of data points. Important: 2n+1 <= m <= 200 !
  • A variable level which is used for the generation of artificial ''measurement errors''. Important: 0 <= level <= 1 ! Now m data points (ti,yi) are computed using the model with the ''true'' coefficients, and afterwards the values yi are perturbed using random numbers ri in [0,1] according to the formula
    yi = yi,orig + (ri - 1/2) * level * 2 * max_i |yi,orig| .
    Subsequently the model is fitted to the perturbed data. Attention: don't hope for wonders! Fitting by exponentials is a notoriously badly conditioned problem. This becomes clear if you consider it as a problem of parameter identification for a linear differential equation with constant coefficients, with the solution given by a discrete and noisy set of data.
  • You have the possibility to change some of the parameters of the code:
    1. the required precision of the solution vector rtol in [1.0d-12, 1.0d-2]
    2. the minimal step size allowed for the length reduction of the full Gauss-Newton step sigmin in the range [1.0d-12, 1.0d-2]
    3. the maximal condition number of the Jacobian before regularization (by truncation) occurs: maxcond in the range [2, 10^16]
    4. the maximal number of steps allowed maxstep >= n
    5. the amount of printed output iprint: 0 = none, 1 = only final result, 2 = short iteration protocol plus final result, 3 = detailed iteration protocol (not recommended)
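The generation of artificial measurements described above can be sketched as follows, assuming the perturbation formula given for level. This is a Python/NumPy illustration, not the code behind the input form; the function name make_perturbed_data is hypothetical.

```python
import numpy as np

def make_perturbed_data(x, a, b, m, level, rng=None):
    """Generate m equidistant data points on [a, b] for the model
    f(x;t) = a0 + a1*exp(b1*t) + ... + an*exp(bn*t),
    x = (a0, a1, b1, ..., an, bn), then perturb them according to
    yi = yi,orig + (ri - 1/2) * level * 2 * max_i |yi,orig|."""
    rng = np.random.default_rng() if rng is None else rng
    t = np.linspace(a, b, m)
    y = np.full(m, x[0], dtype=float)          # constant term a0
    for aj, bj in zip(x[1::2], x[2::2]):       # pairs (aj, bj)
        y += aj * np.exp(bj * t)
    r = rng.random(m)                          # random numbers in [0, 1]
    y_pert = y + (r - 0.5) * level * 2 * np.max(np.abs(y))
    return t, y_pert
```

Note that the perturbation is bounded in magnitude by level * max_i |yi,orig|, so level = 0 reproduces the exact model values.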
 

Output: the predefined exponential fit

 
  • CPU time used.
  • A table with the original and the computed (perturbed) coefficients.
  • The sum of squares of data perturbations.
  • The sum of squares of residuals after fitting.
  • A plot which shows the perturbed data and the fit function.
 

Questions: exponential fit

 
  • How does an increase of the perturbation influence the performance of the method?
  • How sensitively do the coefficients react to an increase of the perturbation level for different choices of the exponents?
  • What happens if two of your chosen exponents are nearly equal, or if the decay is very slow?
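As a hint for the last question: when two exponents are nearly equal, the corresponding columns of the Jacobian become nearly linearly dependent, so JF numerically loses full rank and the condition number explodes. A small Python/NumPy sketch (the parameter values are arbitrary illustrative choices):

```python
import numpy as np

t = np.linspace(0.0, 1.0, 50)

def jac(a1, b1, a2, b2):
    """Jacobian of f(x;t) = a0 + a1*exp(b1*t) + a2*exp(b2*t)
    with respect to x = (a0, a1, b1, a2, b2) at the data points t."""
    return np.column_stack([
        np.ones_like(t),            # df/da0
        np.exp(b1 * t),             # df/da1
        a1 * t * np.exp(b1 * t),    # df/db1
        np.exp(b2 * t),             # df/da2
        a2 * t * np.exp(b2 * t),    # df/db2
    ])

print(np.linalg.cond(jac(1.0, -1.0, 1.0, -5.0)))   # well separated exponents
print(np.linalg.cond(jac(1.0, -1.0, 1.0, -1.01)))  # nearly equal: much larger
```

The second condition number is dramatically larger than the first, which is why the truncation threshold maxcond matters for such fits.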
 

To the input form: exponential fit

 

Input: user defined case

 
  • Here you can define the model function f(x;t) yourself.
  • You have the possibility to fit your own data, in which case your input for x serves as the initial guess, or to have the data generated internally. In the latter case your input for x serves as the ''true'' parameter value; data are computed from this using your model function and then perturbed using random perturbations. The perturbed data are subsequently used for fitting.
  • You specify the dimension np of x. Important: np <= 20 !
  • You specify the model function f(x;t) by a piece of code depending on x(1),...,x(np) and t. This must obey the rules of FORTRAN and the restrictions of JAKEF.
  • You specify the initial vector x(1),...,x(np) with np entries.
  • The Jacobian is produced analytically here, internally, using the code JAKEF from NETLIB.
  • You have the choice to specify your data explicitly or to let the program generate artificial data using your specifications. In the latter case you define:
    1. An interval [a,b] from which the variable t is drawn equidistantly.
    2. The number m of data points. Important: np <= m <= 200 !
    3. An error level level. Important: 0 <= level <= 1 ! m data points (ti, yi,orig = f(x;ti)) are computed, and subsequently the data yi are computed from
      yi = yi,orig + (ri - 1/2) * level * 2 * max_i |yi,orig|
      where the ri are random numbers in [0,1]. These data are then fitted.
    Direct input of data points:
    1. You specify their number m. Important: np <= m <= 200 !
    2. You type the data in the prepared textarea, the data items t(i),y(i)
      separated by comma or blank, not necessarily one per line, for example
      0,0 1,1
      2,1.4 ..
    3. Finally you have the possibility to change some of the parameters of the code:
      1. the required precision of the solution vector rtol in [1.0d-12, 1.0d-2]
      2. the minimal step size allowed for the length reduction of the full Gauss-Newton step sigmin in the range [1.0d-12, 1.0d-2]
      3. the maximal condition number of the Jacobian before regularization (by truncation) occurs: maxcond in the range [2, 10^16]
      4. the maximal number of steps allowed maxstep >= n
      5. the amount of printed output iprint: 0 = none, 1 = only final result, 2 = short iteration protocol plus final result, 3 = detailed iteration protocol (not recommended)
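The free-format input convention described above (t(i), y(i) pairs separated by comma or blank, not necessarily one pair per line) can be read with a small tolerant parser. The following is a Python sketch of that convention, not the actual parser behind the form:

```python
import re

def parse_data(text):
    """Parse t(i), y(i) pairs separated by commas or whitespace,
    not necessarily one pair per line, e.g. "0,0 1,1\n2,1.4"."""
    tokens = [tok for tok in re.split(r"[,\s]+", text) if tok]
    if len(tokens) % 2 != 0:
        raise ValueError("odd number of values: incomplete (t, y) pair")
    values = [float(tok) for tok in tokens]
    return values[0::2], values[1::2]   # t list, y list
```

Splitting on runs of commas and whitespace makes the layout of the textarea irrelevant, as the description requires.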
 

Output: user defined fit

 
  • CPU time used by the minimizer.
  • A table of the original (resp. initial) coefficients and the fitted ones.
  • The sum of squares of the generated errors (if applicable).
  • The sum of squares of residuals for the final solution.
  • A plot which shows the fit data and the fit function.
 

Questions: user defined case

 
  • What do you observe if the fit is ''bad'' (residuals resp. perturbations are large)?
  • When do you observe large perturbations in the coefficients?
  • How do these effects depend on the model function?
 

To the input form: user defined case

 
 
Back to the top!

26.08.2020