MathCAD

Apr 16th, 2001, 11:47 AM

I remember some of us had posted about MathCAD, so here's a question.

Starting from a set of given data (about six scalars), I'd like to perform some calculations and automatically populate a matrix.

The matrix is for my previous post "polynomial curve fit" that I had intended to do with a programming language, but now I'd like to try it in MathCAD.

When I "perform some calculations", I'll have 50 data points for which I'd like to compute the polynomial that fits them.

So given (x0,y0), (x1,y1), ... (x49,y49), I want to populate a 50x51 matrix to look like this:

1 x0 x0^2 x0^3 ... x0^49 y0
1 x1 x1^2 x1^3 ... x1^49 y1
...
1 x49 x49^2 x49^3 ... x49^49 y49

So I'll need to compute the columns that correspond to higher powers (i.e. ^2, ^3, ..., ^49) and enter them into a 50x51 matrix.

How can this be done automatically? The only manual interaction I want is to enter the six initial scalars. I want the final result automatically from MathCAD.

1) I was thinking of some kind of loop (an "until" loop?) to write the columns to a file for each power.
2) Then read in the matrix from the file (but I think this will need to be transposed due to the order it was written?)
3) Transpose this 51x50 matrix into a 50x51 matrix?
4) Use MathCAD's ability to solve the matrix for the solution vector.
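
For illustration only, here is a rough sketch of steps 1 to 4 in Python rather than MathCAD (version 4.0 syntax is not shown anywhere in this thread, so I won't guess at it). It builds the augmented matrix directly in memory, one row per data point, so no file write or transpose is needed. The function names and the four sample points are made up, and a small case stands in for the 50x51 one.

```python
def augmented_vandermonde(xs, ys):
    """Row i is [1, x_i, x_i^2, ..., x_i^(n-1), y_i] for n data points."""
    n = len(xs)
    return [[x ** p for p in range(n)] + [y] for x, y in zip(xs, ys)]


def solve_augmented(aug):
    """Gaussian elimination with partial pivoting on an n x (n+1) matrix."""
    n = len(aug)
    a = [row[:] for row in aug]  # work on a copy
    for col in range(n):
        # Use the row with the largest pivot to limit round-off growth.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    # Back substitution recovers a0 .. a(n-1).
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (a[i][n] - tail) / a[i][i]
    return coeffs


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 11.0, 31.0]  # made-up data lying on y = 1 + x + x^3
aug = augmented_vandermonde(xs, ys)  # 4x5 here; 50x51 for 50 points
coeffs = solve_augmented(aug)
print(coeffs)  # approximately [1, 1, 0, 1]
```

Building each row in one pass is the main point: the powers come out already in row order, which is why the write-to-file and transpose steps drop out.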

Heck, I'm assuming MathCAD can do step 4, and I only just read about file input/output. I'm using version 4.0, believe it or not.

Will this technique work?
Do you have a better idea?

Maybe there is an easier way for MathCAD to give me the polynomial without the steps I've added.

Thanks.
Apr 16th, 2001, 10:09 PM
Guv

A few thoughts.

VirtuallyVB: There is no way that I would try for a 49th order polynomial curve fit. There is hardly ever a reason for trying more than a 2nd to 5th order fit. Furthermore, the systems of linear equations for polynomial curve fitting tend to be ill conditioned (jargon for causing numerical problems when you try to solve them).

The higher the order of the fit, the more ill conditioned the system. The behavior is analogous to chaotic systems. The solution vector is incredibly sensitive to the input data. Since the data is empirical in the first place, the resulting fit is questionable anyway. Slightly different data would result in radically different coefficients. All the literature on this subject recommends that the order of the fit be much less than the number of data points, so a 49th order fit for 50 data points is a bad idea. Even a 9th or 10th order fit is likely to be excessive.
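
To make the sensitivity concrete, here is a small sketch in Python (not MathCAD). It fits a 9th order polynomial exactly through 10 made-up points, nudges one y value by one part in a million, and compares the coefficients. Exact fractions are used so that round-off plays no part; whatever change shows up is due to the problem itself. The function name and the data are assumptions for illustration only.

```python
from fractions import Fraction


def interpolating_coeffs(xs, ys):
    """Exact coefficients [a0, ..., a(n-1)] of the polynomial through all points."""
    n = len(xs)
    a = [[Fraction(x) ** p for p in range(n)] + [Fraction(y)]
         for x, y in zip(xs, ys)]
    for col in range(n):
        # Any nonzero pivot will do, since Fraction arithmetic has no round-off.
        pivot = next(r for r in range(col, n) if a[r][col] != 0)
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * w for v, w in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]


n = 10
xs = [Fraction(k, n - 1) for k in range(n)]  # 10 equally spaced points on [0, 1]
ys = [x * x for x in xs]                     # made-up data lying on y = x^2
delta = Fraction(1, 10**6)
ys_nudged = ys[:]
ys_nudged[5] += delta                        # nudge one reading slightly

base = interpolating_coeffs(xs, ys)
nudged = interpolating_coeffs(xs, ys_nudged)
worst = max(abs(b - c) for b, c in zip(base, nudged))
amplification = float(worst / delta)
print(amplification)  # how many times larger the worst coefficient change is
```

With exact data on a quadratic, the unperturbed solve does return zeros for every coefficient above x^2. The nudged solve does not, and the worst coefficient moves by many thousands of times the size of the nudge, even for only 10 points.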

I just looked the following up in my manual, so I do not know a lot about it. MathCad7 will do a polynomial curve fit. You provide a vector of x values, a vector of y values, and the order of the fit. From what I have just read, it does not provide the coefficients. Instead it allows you to use a function which gives back interpolated y values for a given x argument.

If you want the coefficients, it looks as though you do a lot of the work yourself. Populate the matrix for a linear system and use the MathCad capability for solving it.

I have never used any of MathCad7's programming capabilities. I use it like a high powered calculator with spreadsheet like capabilities.

I have used it to solve small systems of linear equations, invert small matrices, and compute small determinants. I do not know what the upper limit is on dealing with such linear systems. A quick look at my manual suggests that it can handle a lot more than a 50 by 50 system, so you could try for your high order fit. Memory is the limiting factor. There are problems if you try to display large systems. The manual talks about providing for scrolling over large systems.

If I had to do a curve fit for 50 data points, I would not try for more than a 4th to 7th order fit, and would be likely to use a 3rd or 4th order fit. I would want to plot the data first.

MathCad7 allows for defining both vectors and range variables, which I have never used. Once you enter the data, you can plot it, use it in summations, and generally manipulate the data like a VB one dimensional array.

If I were doing a polynomial curve fit, I would enter the data into a vector (array, I think) or range variable. I am not sure which, and perhaps they can be transformed into each other. Once entered, I would plot the points in order to decide on the order of the polynomial fit I wanted. The overall shape should give a clue as to whether an odd or an even ordered fit is likely to be better.

I would probably use the summation capabilities to assign values to variables to be used to populate the matrix defining the linear system. For a 3rd or 4th order fit, this would not be too bad a method. For a higher order fit, I would read the manual and figure out how to program the populating of the matrix.
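
As a rough sketch of the summation approach (in Python, since I have not worked out the MathCad syntax), the sums of powers of x and of x^j * y fill a small square system, the least squares normal equations, which is then solved for the coefficients. The function name and the 50 sample points are made up for illustration.

```python
def least_squares_polyfit(xs, ys, degree):
    """Least-squares coefficients [a0, ..., a_degree] via the normal equations."""
    m = degree + 1
    # Power sums S_k = sum(x^k) and moments T_j = sum(x^j * y) fill the system.
    s = [sum(x ** k for x in xs) for k in range(2 * degree + 1)]
    t = [sum((x ** j) * y for x, y in zip(xs, ys)) for j in range(m)]
    a = [[s[j + k] for k in range(m)] + [t[j]] for j in range(m)]
    # Gaussian elimination with partial pivoting on the m x (m+1) system.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, m):
            factor = a[r][col] / a[col][col]
            for c in range(col, m + 1):
                a[r][c] -= factor * a[col][c]
    # Back substitution.
    coeffs = [0.0] * m
    for i in range(m - 1, -1, -1):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, m))
        coeffs[i] = (a[i][m] - tail) / a[i][i]
    return coeffs


xs = [i / 10 for i in range(50)]                # 50 made-up sample positions
ys = [3.0 + 2.0 * x + 0.5 * x * x for x in xs]  # made-up data on an exact quadratic
fit = least_squares_polyfit(xs, ys, 2)
print(fit)  # approximately [3, 2, 0.5]
```

For a 2nd order fit the system is only 3 by 3 no matter how many data points there are, which is why this is manageable by hand for low orders.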

Once the data was entered and plotted, I would try a 2nd or 3rd order fit, and direct MathCad to plot the resulting polynomial on a graph with the plot of the data points. If the fit looked good, I would leave well enough alone. If not, I would try a higher order fit and plot those results.

Once the data is entered, it does not take much time or effort to do additional curve fits.
Apr 18th, 2001, 09:20 AM

From what Sam said, I'm expecting this to reduce to 2nd or 4th degree (or is that "order") automatically since I'm expecting the following to be the fit:

a0 + a1x + a2x^2

a0 is constrained to be zero.

I've never seen that happen, but it makes sense: a3 through a49 should solve as zeroes.

Aside: Is "Order" equivalent to "Degree + 1" or vice-versa?

If that system should fit a quadratic equation, will there still be that ill-conditioning if I setup the 50x51 matrix I mentioned?

I think MathCAD version 4.0 has the 50 row limit when creating, but can read in a larger matrix to be manipulated. I don't think there was a polynomial fit function, just a least squares linear fit.

Maybe after I solve one, all I'll need is to just populate a 50x4 matrix like this:
1 x0 x0^2 y0
1 x1 x1^2 y1
...
1 x49 x49^2 y49

As far as ill-conditioning, when I was testing my Java Reduced Row Echelon code using data type "double", I'd get "n.999999999999999" or "m.000000000003". I guess I would have preferred "m.000000000003" (don't count the places--I'm just guessing, but the exact value should have been n or m). Or maybe that is simply called "round-off error"?

Hey, what do you enter to have version 7 perform a polynomial curve fit? (Just to make sure that version 4 really doesn't have it)
Apr 18th, 2001, 07:44 PM
Guv

MathCad & Curve Fit thoughts.

I am not sure of the correct jargon. When I refer to a 2nd or 3rd order polynomial curve fit, I mean that I expect to have a quadratic or cubic polynomial respectively for the approximating function, which requires solving 3 or 4 equations in 3 or 4 unknowns. I imagine that some use the number of equations as the order of the fit.

While there are other curve fitting techniques, I was referring to least squares methods.

a0 is constrained to be zero? This is another way of saying that the approximating curve must pass through the origin. I am not sure what this implies for the equations to be set up for the least squares method. Perhaps you should perform some transformation on the data, do the least squares fit, and then adjust the coefficients obtained. I have no ideas or special knowledge about this. It just seems to me that special case situations usually require some kind of special methods.
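
One possible way to handle it, offered as a sketch and not as something I have tested against real data: leave the constant term out of the model, so that the least squares fit only solves for a1 and a2 in y = a1*x + a2*x^2. That gives two normal equations in two unknowns, which can be solved directly. The Python below is illustrative; the function name and the data are made up.

```python
def fit_quadratic_through_origin(xs, ys):
    """Least-squares a1, a2 for y = a1*x + a2*x^2 (a0 fixed at zero)."""
    sxx = sum(x ** 2 for x in xs)
    sx3 = sum(x ** 3 for x in xs)
    sx4 = sum(x ** 4 for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sx2y = sum(x * x * y for x, y in zip(xs, ys))
    # Normal equations:
    #   sxx*a1 + sx3*a2 = sxy
    #   sx3*a1 + sx4*a2 = sx2y
    det = sxx * sx4 - sx3 * sx3
    a1 = (sxy * sx4 - sx3 * sx2y) / det
    a2 = (sxx * sx2y - sx3 * sxy) / det
    return a1, a2


xs = [float(i) for i in range(1, 11)]     # made-up x values 1 .. 10
ys = [2.0 * x + 0.5 * x * x for x in xs]  # made-up data with a0 = 0
a1, a2 = fit_quadratic_through_origin(xs, ys)
print(a1, a2)  # approximately 2 and 0.5
```

In terms of the matrix of powers, this amounts to dropping the column of 1s.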

BTW: For some types of data, a polynomial is not a good approximating function. I know very little about when and how to use other approximating functions. I am sure there is literature on the subject. Plotting the data points first is always a good idea. It can give a clue to whether a polynomial approximation is a good idea, and often strongly suggests that a polynomial of odd rather than even order (or vice versa) be used.

I assume that you are aware that an approximating function should not be used to extrapolate beyond the range of the data points. I am not certain, but I think least squares and other methods tend to get better results inside the range of the data points by generating a function which is very poor outside the range. I think there are special methods if you want to extrapolate beyond the extreme data points. I am not sure about this.

MathCad7 has the following function: regress(VectorX, VectorY, n), where n is the order of the polynomial to be used to fit the data specified by two vectors.

The function: interp(VectorS, VectorX, VectorY, X) is used to obtain Y values for a given X. The manual is not explicit about how to use these functions. I believe you do the following, after making suitable declarations of the variables.

VectorS = regress(VectorX, VectorY, n)
Y = interp(VectorS, VectorX, VectorY, X)

The interp function seems strange to me. Why does it need VectorX & VectorY as arguments?
There is no explanation of what VectorS is, or how many elements it contains. It might contain the polynomial coefficients. If this is the case, then VectorX & VectorY are superfluous arguments.

I have never used the above, and I have researched neither the manual nor the help files. They might supply more explanation. For my purposes, I would set up the simultaneous equations, solve them, and use the resulting polynomial coefficients. The only reason I would use MathCad7 for curve fitting would be to get a polynomial for use in a VB program or a program for my HP calculator.

I would not expect all the high order coefficients to be zero if you try for a very high order polynomial fit. Perhaps it would happen if the data points were an exact fit to a low order polynomial. I think Sam was referring to trying (for example) to use a 5th order polynomial fit when there are only 4 data points, which determine a specific cubic, or 3 data points, which determine a specific quadratic. Even in these cases, without trying it or doing some analysis, I would not be certain that you would always get zero for the high order coefficients.

Ill conditioned refers to something worse than roundoff error, which is what you described in a previous post. In solving simultaneous linear equations and/or inverting matrices, there are many instances of computations like the following.

A23 = A23 - A13*A21 / A11

If the above subtraction involves values which are equal to each other in the first 6-13 significant digits, you have just lost 6-13 digits of precision. If a matrix is ill conditioned, this type of precision loss can occur many times, when once or twice would be bad enough. In such situations, the final results are determined by the values of the last 1-2 digits of the original matrix, and ordinary roundoff error finishes the job of making the results meaningless. If the result of such a subtraction is later used as a row multiplier (or worse a row divisor), the loss of precision affects a large number of elements in the matrix.
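
A tiny illustration of that precision loss, sketched in Python with double precision floats (about 16 significant digits). The values are made up so that the two terms of the subtraction agree in their first 12 digits.

```python
def eliminate(a23, a13, a21, a11):
    """One elimination update, as in A23 = A23 - A13*A21 / A11."""
    return a23 - a13 * a21 / a11


exact = 1e-12        # the difference we intend to compute
a23 = 1.0 + exact    # agrees with 1.0 in its first 12 digits
result = eliminate(a23, 1.0, 1.0, 1.0)
relative_error = abs(result - exact) / exact
print(result, relative_error)
```

The stored value of a23 was already rounded to about 16 digits, and the subtraction strips away the 12 leading digits that matched, so only about 4 good digits survive in the result. The relative error is many orders of magnitude larger than ordinary round-off.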

Polynomial curve fitting has a strong tendency to have the above problem when trying for high order curve fits.

A numerical analysis tip. You should program polynomial evaluation using the following approach (Cubic used as an example).

[(a3*x + a2)*x + a1]*x + a0

It strongly tends to result in better precision than an approach which calculates x^3 and x^2 explicitly, and then adds up the four terms. If the coefficients are put into an array, it obviously makes for easy code using a For/Next Loop. The programming ease is a secondary consideration to the precision effects for high order polynomials. I even program quadratic evaluation as (a*x+b)*x+c.
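
The nested form with the coefficients in an array might look like the following. This is sketched in Python rather than VB; the function name is made up, and the array is assumed to be ordered a0, a1, ..., an.

```python
def horner(coeffs, x):
    """Evaluate a0 + a1*x + ... + an*x^n using nested multiplication."""
    result = 0.0
    for a in reversed(coeffs):  # start from the highest order coefficient
        result = result * x + a
    return result


# Cubic example: 1 + x + 0*x^2 + x^3 evaluated at x = 2
value = horner([1.0, 1.0, 0.0, 1.0], 2.0)
print(value)  # 11.0
```

Each pass through the loop costs one multiply and one add, and no power of x is ever formed explicitly.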
Apr 20th, 2001, 10:50 PM

Thanks.

Thanks Guv.
Apparently version 4 does not have this cool "regress" function to give a polynomial curve fit. But I guess that's what I'm trying to write.

The version 4 documentation states for "interp(vs, vx, vy, x)":

Quote:

Uses the data vectors vx and vy, as well as the second-derivative vector vs, to return the interpolated y value corresponding to the argument x. You generate the vector vs by using either lspline, pspline, or cspline on the vectors vx and vy.

Again, the vector vs contains the second derivatives of the fitted curve at the points in question.

vs = lspline(vx, vy) -- straight line
vs = pspline(vx, vy) -- parabola
vs = cspline(vx, vy) -- cubic curve

and now "vs = VectorS = regress(VectorX, VectorY, n)" with version 7.

So, I guess that the interp function is a general (modular type) function, instead of MathSoft writing separate linearInterpolation(), parabolicInterpolation(), cubicInterpolation(), and regressionInterpolation() functions.

Anyway, I thought all of this is considered "regression". Feel free to clarify.