Interpolation with positivity constraint


 
  • By interpolation one means the process of replacing a function, given only by a finite set of data, by a "simple" function that reproduces these data and is given by a formula evaluable everywhere in the range of interest, with the aim of evaluation or other mathematical manipulation (say integration, differentiation, etc.). In applications the function values often have a physical meaning which requires strict positivity. Then the usual simple interpolation by polynomials, rationals or splines is not necessarily applicable. But there is a simple and easily computed function which fulfills the positivity constraint automatically: the exponential function. Hence it is an obvious idea to interpolate the logarithm of the data by a polynomial and to take the exponential of this polynomial as the interpolating function for the data. We extend this idea here and allow a shift of the ordinates prior to taking the logarithm.
  • We have data (x_i, y_i), i=0,...,n (with the meaning y_i = f(x_i), f not necessarily known) and pairwise distinct x_i. With a shift s such that y_i + s > 0 ∀ i we compute
    1. the unique polynomial p of maximum degree n with
      p(x_i) = log(y_i + s), i=0,...,n ,
    2. then we take
      g(x) = exp(p(x)) - s as the approximating function for f (see the sketch below). Obviously
      g(x_i) = exp(p(x_i)) - s = exp(log(y_i + s)) - s = y_i
      as required.
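  • A minimal sketch of this construction in Python, assuming NumPy (the name positive_interpolant and the use of numpy.polynomial for the interpolation step are our own choices, not part of this page):

      import numpy as np

      def positive_interpolant(x, y, s=0.0):
          """g(t) = exp(p(t)) - s, where p interpolates log(y_i + s)."""
          y = np.asarray(y, dtype=float)
          assert np.all(y + s > 0), "the shift s must make all ordinates positive"
          # degree n through n+1 points: the least-squares fit is the interpolant
          p = np.polynomial.Polynomial.fit(x, np.log(y + s), deg=len(x) - 1)
          return lambda t: np.exp(p(t)) - s   # > -s everywhere, g(x_i) = y_i

      g = positive_interpolant([0.0, 0.5, 1.0], [0.2, 1.5, 0.1])
      print(g(0.25))   # strictly positive, since here s = 0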
  • The polynomial p is represented here in Newton form:
    N_i(x) = Π_{j=0}^{i-1} (x - x_j), i=1,...,n
    N_0(x) = 1

    p(x) = ∑_{i=0}^{n} c_{0,i} N_i(x)
  • The computation of the coefficients is done using the scheme of "divided differences":
    c_{i,0} = log(y_i + s), i=0,...,n
    c_{i,j+1} = (c_{i+1,j} - c_{i,j}) / (x_{i+j+1} - x_i), j=0,...,n-1, i=0,...,n-j-1
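  • A sketch of this scheme in Python, keeping only the current column of the triangular table (names are ours):

      import numpy as np

      def newton_coefficients(x, y, s=0.0):
          """Divided differences of log(y_i + s); returns [c_{0,0}, ..., c_{0,n}]."""
          x = np.asarray(x, dtype=float)
          c = np.log(np.asarray(y, dtype=float) + s)   # column j = 0
          n = len(x) - 1
          coeffs = [c[0]]
          for j in range(n):
              # c_{i,j+1} = (c_{i+1,j} - c_{i,j}) / (x_{i+j+1} - x_i)
              c = (c[1:] - c[:-1]) / (x[j+1:] - x[:n-j])
              coeffs.append(c[0])
          return coeffs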

  • The computational effort is O(n²) for the coefficients and O(n) for a single polynomial value; the latter is achieved by exploiting the common factors of the N_i, as sketched below.
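  • A sketch of such an O(n) evaluation, using the nesting p(x) = c_{0,0} + (x-x_0)(c_{0,1} + (x-x_1)(c_{0,2} + ...)):

      def newton_eval(coeffs, x_nodes, t):
          """Evaluate p(t) = sum_i c_{0,i} N_i(t) with n multiplications."""
          result = coeffs[-1]
          for c, xi in zip(coeffs[-2::-1], x_nodes[-2::-1]):
              result = result * (t - xi) + c
          return result

      # the final approximation of f: g(t) = exp(newton_eval(coeffs, x, t)) - s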
  • If y_i = f(x_i) and f is n+1 times continuously differentiable, then the interpolation error between p and ψ(x) = log(f(x)+s) is
    R_n(x) = ψ^{(n+1)}(ξ)/(n+1)! * (x-x_0)*...*(x-x_n)
    where ξ is an unknown value between x and the x_i, and hence
    |R_n(x)| <= M_{n+1}/(n+1)! * (b-a)^{n+1}
    for an arbitrary distribution of the abscissae in [a,b], where M_{n+1} is an upper bound for the absolute value of the (n+1)-th derivative of ψ on [a,b]. If this derivative has no zeros on [a,b] and m_{n+1} > 0 is a lower bound for its absolute value on [a,b], then there is at least one x in [a,b] where
    |log(f(x)+s) - p(x)| >= m_{n+1}/(n+1)! * 2*((b-a)/4)^{n+1}
    whatever (optimal) distribution of abscissae one chooses. This shows that the method always gives good results if the interval span is small, but not if there exists a (possibly complex) singularity of ψ whose distance from [a,b] is smaller than 1/(b-a), because then the derivatives of ψ grow (in n) faster than the factor 1/(n+1)! can compensate.
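  • The constant 2*((b-a)/4)^{n+1} in this lower bound is the min-max value of the node polynomial; a short justification of this step (our addition, the standard Chebyshev argument, in LaTeX):

      \min_{x_0,\dots,x_n \in [a,b]} \max_{x \in [a,b]}
        \Bigl|\prod_{j=0}^{n} (x - x_j)\Bigr| = 2 \left( \frac{b-a}{4} \right)^{n+1}
      % on [-1,1] the monic polynomial of degree n+1 with smallest maximum norm
      % is 2^{-n} T_{n+1}, with norm 2^{-n}; the affine map onto [a,b] scales
      % monic polynomials of degree n+1 by ((b-a)/2)^{n+1}, and
      % 2^{-n} ((b-a)/2)^{n+1} = 2 ((b-a)/4)^{n+1}.

    The minimum is attained exactly by the Chebyshev abscissae described under "Input" below.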
  • The error of the final approximation g(x) for f(x) then is
    |f(x) - g(x)| <= |f(x)+s| * ( exp(|R_n(x)|) - 1 )
    with
    exp(|R_n(x)|) - 1 = |R_n(x)| + O(|R_n(x)|²),
    and the discussion of the remainder formula above shows that the derivatives of ψ play a crucial role here. Now
    (d/dx) log(f(x)+s) = f'(x)/(f(x)+s)
    (d/dx)² log(f(x)+s) = f''(x)/(f(x)+s) - (f'(x)/(f(x)+s))²
    (d/dx)³ log(f(x)+s) = f'''(x)/(f(x)+s) - 3*f'(x)f''(x)/(f(x)+s)² + 2*(f'(x)/(f(x)+s))³

    and so on, such that it looks promising to choose a large positive shift s in order to diminish this error term. This, however, has its limits: first, s also appears as a multiplicative factor, and second, the addition of a large number may spoil the precision of the function values. For example, think of f(x) = sin(2πx) and s = 10^4. Then all values y_i + s lie in the range [9999,10001] and hence their logarithms in the range [9.210240367, 9.210440367], almost constant. The polynomial interpolation of this will produce a very small interpolation error, but the last four digits of the numerical sin values are lost. Finally, by the subtraction of 10000 after evaluation of exp, again about four digits of precision are lost.
    Nevertheless it is interesting to play with s; see the small experiment below.
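  • The digit loss can be checked directly; a small Python experiment (the node count 9 is an arbitrary choice of ours):

      import numpy as np

      x = np.linspace(0.0, 1.0, 9)
      y = np.sin(2 * np.pi * x)
      s = 1.0e4
      # round trip through the shifted logarithm alone, no interpolation involved:
      y_back = np.exp(np.log(y + s)) - s
      print(np.max(np.abs(y_back - y)))   # of order 1e-12...1e-11: ~4 digits lost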
 

Input

 
  • You may either interpolate a discrete data set of your own (getting a plot that shows the approximating function g(x) = exp(p(x)) - s together with your data) or choose a demonstration using predefined or self-defined functions for the artificial generation of the y-data. You may choose between taking no shift at all, just shifting the ordinates to values >= 1, i.e.
    shift = max(0, 1-y_0, ..., 1-y_n),
    or a user-defined shift. (The automatic shift is sketched in code below.)
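  • The automatic shift as a short Python sketch (the helper name is ours):

      import numpy as np

      def automatic_shift(y):
          """shift = max(0, 1-y_0, ..., 1-y_n), so that all y_i + shift >= 1."""
          return max(0.0, float(np.max(1.0 - np.asarray(y, dtype=float))))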
  • In both cases you have the choice to see the table of divided differences, from which you can draw information about the growth of ψ's derivatives, since column j, j=0,...,n, contains values of ψ^{(j)}(.)/j! at unknown intermediate points (as long as the computed values are not spoiled by roundoff).
  • Also in both cases you can get a printed formula of the polynomial as provided by the generalized Horner scheme.
  • There are 5 predefined cases of functions with different growth properties of their derivatives. You may also define a function of your own choice using a piece of code following FORTRAN conventions.
  • You also specify the degree n, which is restricted to 1 <= n <= 50.
  • Then you either give a list of your (x,y)-data or
  • specify the interval from where the abscissae should be drawn.
  • In the latter case you also choose how to distribute the abscissae in this interval (both choices are sketched in code below): either equidistant (a common case in practice), i.e.
    x_i = a + i*(b-a)/n, i=0,...,n,
    or in an optimal way (as far as this is possible independently of f), namely
    x_i = (a+b)/2 + ((b-a)/2)*cos(((2i+1)/(2n+2))*π), i=0,...,n,
    the so-called Chebyshev abscissae, which are obtained by solving
    min_{x_0,...,x_n in [a,b]} max_{x in [a,b]} |Π_{j=0}^{n} (x-x_j)| .
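  • Both node choices as a Python sketch (the helper names are ours):

      import numpy as np

      def equidistant_nodes(a, b, n):
          """x_i = a + i*(b-a)/n, i = 0..n."""
          return a + np.arange(n + 1) * (b - a) / n

      def chebyshev_nodes(a, b, n):
          """x_i = (a+b)/2 + ((b-a)/2)*cos((2i+1)/(2n+2)*pi), i = 0..n."""
          i = np.arange(n + 1)
          return (a + b) / 2 + (b - a) / 2 * np.cos((2 * i + 1) * np.pi / (2 * n + 2))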
 

Output

 
  • In case of your own discrete data: a plot showing these data and the interpolating function g, and a printed formula of g using the generalized Horner scheme.
  • If you have generated the data artificially, then you get a plot of f and one of the error curve f(.) - g(.).
 

Questions?!

 
  • For which functions does the error decrease with increasing degree, and why?
  • For which functions does the opposite (divergence) occur, and why?
  • What happens if you take the Chebyshev abscissae?
  • What happens if you try a nonsmooth function?
  • How do different choices of the shift affect the total error?
 
