Linear least squares: fitting a surface

 
  • The method of least squares is not limited to scalar functions of one real variable. On the contrary, both the independent variables (assumed to be error free) and the dependent variables may be vectors of any dimension. In this case each component of the dependent variable gets its own row in the data matrix, whereas the independent variables, which enter the ''Ansatz'' functions, usually just contribute more columns. Hence, if you have M data of dimension m and n functions in your ''Ansatz'', the system matrix has size (M·m) × n. (This implies that the different components of the dependent variable may have different formulae!) The same applies if the ''Ansatz'' is nonlinear. The application here is the fit of a surface in explicit form
    $z = f(x,y) = \sum_{i=1}^{n} a_i \varphi_i(x,y)$
    to given data $(x_j, y_j, z_j)$, $j = 1, \dots, m$.
    The functions $\varphi_i(x,y)$ may either be specified by the user or taken from the polynomial basis $1, x, y, x^2, xy, y^2, x^3, x^2y, xy^2, y^3$, depending on the chosen maximum degree (1, 2 or 3). The coordinates $(x_k, y_k)$ of the independent variables can be scattered arbitrarily in the plane. In the graphical output the domain of definition is extended to the smallest rectangle enclosing these points. We assume that the system matrix has full rank and ''give up'' otherwise. The least squares problem is solved by QR decomposition with Householder reflections.
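The fit described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the code behind the input form: the names `poly_basis` and `fit_surface` are hypothetical, and `numpy.linalg.qr` (which calls LAPACK's Householder-based QR) stands in for the solver mentioned in the text.

```python
import numpy as np

def poly_basis(x, y, degree):
    """Design matrix with columns 1, x, y, x^2, xy, y^2, ... (total degree <= degree)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cols = [x ** (i - j) * y ** j for i in range(degree + 1) for j in range(i + 1)]
    return np.column_stack(cols)

def fit_surface(x, y, z, degree):
    """Least squares coefficients a for z ~ sum_i a_i * phi_i(x, y)."""
    A = poly_basis(x, y, degree)
    if np.linalg.matrix_rank(A) < A.shape[1]:
        # Mirrors the "give up" behaviour for a rank-deficient system matrix.
        raise ValueError("system matrix does not have full rank")
    Q, R = np.linalg.qr(A)  # reduced QR: A = Q R with R upper triangular
    # Solve R a = Q^T z; R is square and nonsingular because A has full rank.
    return np.linalg.solve(R, Q.T @ np.asarray(z, dtype=float))
```

A call such as `fit_surface(x, y, z, 2)` returns the six coefficients in the order of the basis 1, x, y, x², xy, y². A dedicated triangular solver would exploit the structure of R; the general solver is used here only to keep the sketch dependent on NumPy alone.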
 

Input:

 
  • You can choose whether to use the polynomial fit or your own ''Ansatz''.
  • For the polynomial fit you specify the degree. Degree 1 gives n=3, degree 2 gives n=6 and degree 3 gives n=10 unknowns. The surface is then
    1. $a_1 + a_2 x + a_3 y$
    2. $a_1 + a_2 x + a_3 y + a_4 x^2 + a_5 xy + a_6 y^2$
    3. $\sum_{i=0}^{3} \sum_{j=0}^{i} a_{n(i)+j}\, x^{i-j} y^{j}$
      with n(0)=1, n(1)=2, n(2)=4, n(3)=7
  • For an ''Ansatz'' of your own you must specify the n functions $\varphi_i(x,y)$ in the prepared input fields.
  • You have the choice either to generate all the data artificially or to explicitly provide your own data. In the first case you specify:
    1. The ''true'' coefficient vector a with n components.
    2. The lower and upper bounds for x and y.
    3. The number of data points. The (x,y) data are drawn at random, uniformly distributed over the rectangle, and the corresponding z value is then computed from a.
    4. An error level (in percent of the maximum absolute value $z_{\max}$ of the $z_{k,\mathrm{true}}$). The final data are then
      $z_k = z_{k,\mathrm{true}} + 2\,(0.5 - \mathrm{random}()) \cdot \mathrm{level} \cdot z_{\max}$
  • If you provide your own data, you specify their number m and 3 × m numbers, which are read as m triples (x,y,z).
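The generation of artificial data can be sketched as follows. The function name `generate_data` and its parameters are hypothetical, and two assumptions are made that the text does not settle: `level` is passed as a fraction (0.05 for 5 percent), and `random()` is a uniform generator on [0, 1).

```python
import numpy as np

def generate_data(a_true, basis, x_bounds, y_bounds, m, level, rng=None):
    """Return scattered (x, y), noisy z and exact z_true for a given Ansatz.

    basis is a list of callables phi_i(x, y); level is assumed to be a
    fraction of z_max (an assumption, the form asks for a percentage).
    """
    rng = np.random.default_rng() if rng is None else rng
    # Uniformly scattered points in the rectangle x_bounds x y_bounds.
    x = rng.uniform(*x_bounds, size=m)
    y = rng.uniform(*y_bounds, size=m)
    # Exact surface values from the "true" coefficients.
    z_true = sum(a * phi(x, y) for a, phi in zip(a_true, basis))
    z_max = np.max(np.abs(z_true))
    # Perturbation as in the formula above: uniform in (-level*z_max, level*z_max].
    z = z_true + 2.0 * (0.5 - rng.random(m)) * level * z_max
    return x, y, z, z_true
```

For the degree 1 polynomial, `basis` could be `[lambda x, y: np.ones_like(x), lambda x, y: x, lambda x, y: y]`.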
 

Output:

 
  • The sum of squared deviations after the fit (and, in the case of generated data, also before it).
  • A table of the computed coefficients (and, in the case of generated data, the ''true'' input values).
  • A 3D plot which shows the surface and the data points.
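The first output item can be illustrated with a small sketch. It assumes that ''before the fit'' means the deviation of the noisy data from the surface with the ''true'' coefficients, which the text does not state explicitly. `numpy.linalg.lstsq` (SVD based) is swapped in for the QR solver purely for brevity, and `sum_sq_dev` is a hypothetical helper name.

```python
import numpy as np

def sum_sq_dev(A, a, z):
    """Sum of squared deviations between the data z and the model A @ a."""
    r = z - A @ a
    return float(r @ r)

rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, 50)
y = rng.uniform(0.0, 1.0, 50)
A = np.column_stack([np.ones_like(x), x, y])   # degree 1 basis: 1, x, y
a_true = np.array([1.0, 2.0, -1.0])
z_true = A @ a_true
# Noisy data with an assumed error level of 0.05 (i.e. 5 percent of z_max).
z = z_true + 2.0 * (0.5 - rng.random(50)) * 0.05 * np.max(np.abs(z_true))

a_fit, *_ = np.linalg.lstsq(A, z, rcond=None)
before = sum_sq_dev(A, a_true, z)   # deviation from the "true" surface
after = sum_sq_dev(A, a_fit, z)     # deviation from the fitted surface
print(before, after)
```

Because the fitted coefficients minimise the sum of squares over all coefficient vectors, `after` can never exceed `before`.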
 

Questions:

 
  • If you use different error levels with generated data, what can you infer from the deviations of the coefficients?
  • What should you know in order to predict those deviations?
  • How do different choices of an ''Ansatz'' (different bases of the same linear space) behave?
  • What role does the choice of the (x,y)-area play? (Think about variable transformations!)
 


08.10.2010