MathCad & Curve Fit thoughts.
I am not sure of the correct jargon. When I refer to a 2nd or 3rd order polynomial curve fit, I mean that I expect to have a quadratic or cubic polynomial respectively for the approximating function, which requires solving 3 or 4 equations in 3 or 4 unknowns. I imagine that some use the number of equations as the order of the fit.
While there are other curve fitting techniques, I was referring to least squares methods.
a0 is constrained to be zero? This is another way of saying that the approximating curve must pass through the origin. I am not sure what this implies for the equations to be set up for the least squares method. Perhaps you should perform some transformation on the data, do the least squares fit, and then adjust the coefficients obtained. I have no ideas or special knowledge about this. It just seems to me that special case situations usually require some kind of special methods.
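For what it is worth, one standard way to handle a0 = 0 is to leave the constant term out of the approximating function before setting up the least squares equations, so no transformation of the data is needed. The sketch below does this in Python for a quadratic, y = a1*x + a2*x^2. The function name and the sample data are made up for illustration, and this is not a claim about what MathCad does internally.

```python
def fit_quadratic_through_origin(xs, ys):
    """Least squares fit of y = a1*x + a2*x^2 (a0 fixed at zero)."""
    # Sums that appear in the two normal equations.
    sxx = sum(x**2 for x in xs)
    sx3 = sum(x**3 for x in xs)
    sx4 = sum(x**4 for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sx2y = sum(x**2 * y for x, y in zip(xs, ys))

    # Normal equations (2 equations in 2 unknowns):
    #   a1*sxx + a2*sx3 = sxy
    #   a1*sx3 + a2*sx4 = sx2y
    det = sxx * sx4 - sx3 * sx3
    if det == 0:
        raise ValueError("need at least two distinct nonzero x values")
    a1 = (sxy * sx4 - sx3 * sx2y) / det
    a2 = (sxx * sx2y - sx3 * sxy) / det
    return a1, a2


# Hypothetical data lying exactly on y = 2x + 0.5x^2.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 0.5 * x**2 for x in xs]
a1, a2 = fit_quadratic_through_origin(xs, ys)
print(a1, a2)
```

With the constant term dropped there is one fewer equation and one fewer unknown, and the fitted curve passes through the origin by construction.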
BTW: For some types of data, a polynomial is not a good approximating function. I know very little about when and how to use other approximating functions. I am sure there is literature on the subject. Plotting the data points first is always a good idea. It can give a clue to whether a polynomial approximation is a good idea, and often strongly suggests that a polynomial of odd rather than even order (or vice versa) be used.
I assume that you are aware that an approximating function should not be used to extrapolate beyond the range of the data points. I am not certain, but I think least squares and other methods tend to get better results inside the range of the data points by generating a function which is very poor outside the range. I think there are special methods if you want to extrapolate beyond the extreme data points. I am not sure about this.
MathCad7 has the following function: regress(VectorX, VectorY, n), where n is the order of the polynomial to be used to fit the data specified by two vectors.
The function: interp(VectorS, VectorX, VectorY, X) is used to obtain Y values for a given X. The manual is not explicit about how to use these functions. I believe you do the following, after making suitable declarations of the variables.
VectorS = regress(VectorX, VectorY, n)
Y = interp(VectorS, VectorX, VectorY, X)
The interp function seems strange to me. Why does it need VectorX & VectorY as arguments?
There is no explanation of what VectorS is, or how many elements it contains. It might contain the polynomial coefficients. If this is the case, then VectorX & VectorY are superfluous arguments.
I have never used the above, and I have researched neither the manual nor the help files. They might supply more explanation. For my purposes, I would set up the simultaneous equations, solve them, and use the resulting polynomial coefficients. The only reason I would use MathCad7 for curve fitting would be to get a polynomial for use in a VB program or a program for my HP calculator.
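A minimal sketch of the "set up the simultaneous equations and solve them" approach, written in Python rather than MathCad or VB. It builds the least squares normal equations for a polynomial of the given order and solves them by Gaussian elimination with partial pivoting. The function name and the sample data are hypothetical.

```python
def polyfit_normal_equations(xs, ys, order):
    """Return [a0, a1, ..., a_order] for a least squares polynomial fit."""
    n = order + 1  # number of equations and unknowns
    # Augmented matrix of the normal equations:
    #   sum_k a_k * sum_i x_i^(j+k) = sum_i y_i * x_i^j,  j = 0..order
    aug = [[sum(x ** (j + k) for x in xs) for k in range(n)]
           + [sum(y * x ** j for x, y in zip(xs, ys))]
           for j in range(n)]

    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0:
            raise ValueError("singular system: too few distinct x values")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]

    # Back substitution.
    coeffs = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * coeffs[k] for k in range(row + 1, n))
        coeffs[row] = (aug[row][n] - tail) / aug[row][row]
    return coeffs


# Hypothetical data lying exactly on y = 1 + 2x + 3x^2.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0 + 2.0 * x + 3.0 * x**2 for x in xs]
coeffs = polyfit_normal_equations(xs, ys, 2)
print(coeffs)
```

For a quadratic this is 3 equations in 3 unknowns, and for a cubic 4 in 4. The returned coefficients can be copied straight into another program.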
I would not expect all the high order coefficients to be zero if you try for a very high order polynomial fit. Perhaps it would happen if the data points were an exact fit to a low order polynomial. I think Sam was referring to trying (for example) to use a 5th order polynomial fit when there are only 4 data points, which determine a specific cubic, or 3 data points, which determine a specific quadratic. Even in these cases, without trying it or doing some analysis, I would not be certain that you would always get zero for the high order coefficients.
Ill conditioned refers to something worse than roundoff error, which is what you described in a previous post. In solving simultaneous linear equations and/or inverting matrices, there are many instances of computations like the following.
A23 = A23 - A13*A21 / A11
If the above subtraction involves values which are equal to each other in the first 6-13 significant digits, you have just lost 6-13 digits of precision. If a matrix is ill conditioned, this type of precision loss can occur many times, when once or twice would be bad enough. In such situations, the final results are determined by the values of the last 1-2 digits of the original matrix, and ordinary roundoff error finishes the job of making the results meaningless. If the result of such a subtraction is later used as a row multiplier (or worse a row divisor), the loss of precision affects a large number of elements in the matrix.
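A small illustration of the subtraction problem, in Python, whose floats carry roughly 16 significant digits. The two values subtracted agree in their first 8 digits, so only about 8 good digits survive in the difference. The numbers are made up for the example.

```python
small = 1e-8

# 1.00000001 and 1.0 agree in their first 8 significant digits.
computed = (1.0 + small) - 1.0

# In exact arithmetic the result would be `small` again.
relative_error = abs(computed - small) / small
print(computed, relative_error)
```

The relative error is many orders of magnitude larger than the rounding error of a single ordinary operation, even though only one subtraction was involved.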
Polynomial curve fitting has a strong tendency to have the above problem when trying for high order curve fits.
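One way to see why, sketched in Python under the assumption that the x values lie between 0 and 1 (the data and the function name are hypothetical): over such a range the high powers of x look almost alike, so the corresponding equations are nearly redundant. The code measures this as the cosine of the angle between the vectors of x^p and x^q values, where a cosine close to 1 means the two are almost parallel.

```python
import math


def cosine_between_powers(xs, p, q):
    """Cosine of the angle between the vectors (x_i^p) and (x_i^q)."""
    dot = sum(x ** (p + q) for x in xs)
    norm_p = math.sqrt(sum(x ** (2 * p) for x in xs))
    norm_q = math.sqrt(sum(x ** (2 * q) for x in xs))
    return dot / (norm_p * norm_q)


# 21 evenly spaced sample points from 0 to 1.
xs = [i / 20 for i in range(21)]
low = cosine_between_powers(xs, 1, 2)    # x versus x^2
high = cosine_between_powers(xs, 9, 10)  # x^9 versus x^10
print(low, high)
```

The higher pair comes out closer to 1 than the lower pair, which is the kind of near-dependence that leads to the repeated precision loss described above.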
A numerical analysis tip. You should program polynomial evaluation using the following approach (Cubic used as an example).
[(a3*x + a2)*x + a1]*x + a0
It strongly tends to result in better precision than an approach which calculates x^3 and x^2 explicitly, and then adds up the four terms. If the coefficients are put into an array, it obviously makes for easy code using a For/Next Loop. The programming ease is a secondary consideration to the precision effects for high order polynomials. I even program quadratic evaluation as (a*x+b)*x+c.
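A short sketch of the loop version, with Python standing in for a VB For/Next loop. The coefficient ordering (highest power first) and the function name are my own choices for the example.

```python
def horner(coeffs, x):
    """Evaluate a polynomial; coeffs are ordered highest power first."""
    result = 0.0
    for a in coeffs:
        # Same pattern as [(a3*x + a2)*x + a1]*x + a0, one step per pass.
        result = result * x + a
    return result


# x^3 - 2x + 5 at x = 3, i.e. [(1*3 + 0)*3 - 2]*3 + 5
value = horner([1.0, 0.0, -2.0, 5.0], 3.0)
print(value)
```

The same loop works for any order, since only the length of the coefficient array changes.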