Newton's method: 2 variables

Directly to the input form

 
  • In the following x, y denote the two (real) components of the variable vector and (f1,f2) are the two components of the vector function F.
  • Newton's method is the best known method for solving systems of nonlinear equations F(x,y) = (f1,f2)(x,y) = 0.
  • We assume in the following that F is at least two times continuously differentiable, that a solution of the problem exists and that the Jacobian of F is invertible there. Then the method and its properties can be derived using Taylors theorem. From
    F(x*,y*)-F(x,y) = JF(x,y)e + O(||e||2)
    with
    e=(x*,y*)-(x,y)
    we get from our assumptions that
    ||F(x,y)|| = O(||e||) and ||e|| = O(||F(x,y)||). (**)
    Neglecting the second order term in e we get a linear system of equations for e
    JF(x,y)e = -F(x,y).
    We use this for (x,y)=(xk,yk) in order to define the next iterate (which replaces (x*,y*) in the formula):
    (xk+1,yk+1) = (xk,yk) + ek.
    From this, using Taylors formula for F(xk+1,yk+1)-F(xk,yk) we get
    ||F(xk+1,yk+1)|| = O (||F(xk,yk)||2)
    hence using (**) again
    ||ek+1|| = O(||ek||2)
    and this shows that the method is locally quadratically convergent under these assumptions.
  • The invertibility of the Jacobian matrix is an indispensible requirement for the method, but one time continuous differentiability suffices for proving superlinear convergence.
  • The method can be interpreted geometrically: at the current iterate compute the two tangent planes for the two surfaces (x,y,z=fi(x,y)). Invertibility of the Jacobian means that these tangent planes are not parallel, hence intersect in a line. This line will meet the plane z=0 in 3-space in the next iterate.
  • There exists a famous theorem of Kantorovich which makes the statement about locality of convergence quantitative and even proves existence of a solution. Its conditions however are hard to check in practice, hence ''in real life'' one simply hopes that F is formulated reasonably and that the users knowledge suffices to provide a sufficiently good initial guess.
  • This method extends to any number of variables.
  • For large dimension the computation of the Jacobian and the linear systems solution however might become quite costly.
  • We terminate if the Jacobian is judged as singular, if the iteration leaves a user specified box, if more than 200 steps are taken or if the desired precision has been reached.
 

Input

 
  • For specifying the function F we have four predefined cases but you also may specify the two component functions by a piece of code which computes f1 and f2 given x, y. In this case you must follow the conventions of FORTRAN. The variable names must be x and y.
  • In case of a function of your own you also must specify a bounding rectangle [xmin,xmax] × [ymin,ymax] in the plane where the solution should lie. We use this for detecting possible divergence: computation terminates if the computed sequence leaves this rectangle.
  • You specify a variable ε. We consider the precision obtained as sufficient if the l1-norm of the correction (i.e. |&deltax|+|δy|) is less than (ymax-ymin+xmax-xmin) × ε.
  • Your initial guess xstart and ystart which of course must lie strictly inside the bounding rectangle [xmin,xmax] × [ymin,ymax].
  • Since we produce an animated 3D-Plot you also must specify your view point. This is done by two rotation angles for the x- and the z- axis: Originally the coordinate axes are screen horizontal (x) , screen vertical (y) and vertical to the screen (z).
  • The visibility in this plot limits a senseful value of ε . But you may use the output of one run with a smaller box and a smaller ε for more seeing more details.
 

Output

 
  • A table of the iterates , the two function values and the l1- norm of the inverse of the Jacobian.
  • We judge a Jacobian as singular if its determinant is less than 1.0e-8 times the sum of absolute values of its four entries.
  • We perform at most 49 steps in order to limit the amount of output.
  • The termination criterion is : l1-norm of correction less than
    ε × (xmax-xmin) × (ymax-ymin) .
  • An animated plot which shows the two surfaces (x,y,f1), (x,y,f2) and the plane (x,y,z=0) in different colors (the solution(s) being at points where these three colors meet) and the iterates (as bars above the surfaces).
 

Questions ?!

 
  • In which initial points you will always get problems?
  • If there are several solutions: will it always work?
  • How might problems show up ?
 

To the input form

 
Back to the top!

18.05.2016