Re: Any Statisticians here?
Gary,
The first thing I would say about your approach is that I have to assume you do not have an equal number of production hours each month (i.e., holidays, weekends, vacation trends...), and you presumably do not want those factors affecting what you are trying to deduce. I also assume that there is some trend in the data; hopefully, you are always producing more widgets! And I assume that there are both cyclical and seasonal factors. Take car production, for example: if you analyze the data over decades, you will find a trend, a cyclical, a seasonal, and a random component in the production figures.
So, I would "normalize" my production number by some measure of available production capacity, perhaps paid production worker hours (not actual pay, which has inflation and overtime issues; just the hours spent producing). After all, if I make 22k widgets on 2200 hours in month 1 but 23k widgets on 2400 hours in month 2, I actually have a DECREASE in productivity.
Then you would have 22/2200 = .010000 (thousand widgets per hour, i.e. 10 widgets per hour) for month 1 and 23/2400 = .009583 for month 2. This would, theoretically, eliminate the need to "seasonally adjust" your numbers. It would also probably eliminate the need to correct for a "trend" or "cycle" in the data, because now you are no longer talking about "production"; you are instead talking about "productivity".
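As a quick check of the arithmetic above, here is a minimal Python sketch using the two example months from this post. The names `months` and `productivity` are just illustrative choices of mine, and the figures are in thousands of widgets, as in the division above.

```python
# Example figures from the post: (thousands of widgets, production hours).
months = {
    "month 1": (22, 2200),
    "month 2": (23, 2400),
}


def productivity(widgets_k, hours):
    """Thousands of widgets produced per production hour."""
    return widgets_k / hours


rates = {name: productivity(w, h) for name, (w, h) in months.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.6f} thousand widgets per hour")

# Relative change from month 1 to month 2; a negative value means
# productivity fell even though raw production rose.
change = rates["month 2"] / rates["month 1"] - 1
print(f"change in productivity: {change:.1%}")
```

Raw production goes up by 1k widgets, but the per-hour figure goes down, which is the whole point of normalizing first.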
Once your data is normalized, I would say that although it is appealing, it is probably counter-productive to try to come up with an equation that predicts productivity for the next month based on prior months, no matter how much data you have. This is something of a theoretical issue.
If we normalize the data and therefore do not have a trend or cyclical/seasonal effect in the data, we should be left with only productivity itself and random influences.
If you want to determine whether one month's productivity might affect another month's, there are a number of "nonparametric" statistics that you might employ, depending on the nature of your situation.
First, we need to establish what your "Null Hypothesis" is. Then we gather statistics to see if we have enough evidence to reject that hypothesis.
It sounds like one question you are attempting to answer is: "Does high productivity in a given month lead to higher or lower productivity in the following month?"
I might try something like a simple "runs" test, where I arrange my normalized data in order by month, then assign a + if the productivity is up from the previous month or a - if it is down.
Then you state the Null Hypothesis: "Productivity is random (there are no patterns in the +/- data)." The question is whether there is sufficient evidence to reject that hypothesis.
Then you count the number of "runs" in the data (e.g., in +++--++-+---- there are 6 runs out of a possible 13). If the number of runs is far from what you would expect by chance (too few suggests that months tend to follow each other, too many suggests that they tend to alternate), you would say that you have enough evidence to reject the null hypothesis and conclude that there is some dependence between months. If the number of runs is close to what random chance would produce, you fail to reject the null hypothesis; you just don't have enough evidence to the contrary to convince the jury.
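For anyone who wants to try the counting step, here is a rough Python sketch. It assumes the "runs up and down" variant of the test (signs taken from month-to-month changes), for which the expected number of runs under randomness is (2n - 1)/3 with variance (16n - 29)/90, where n is the number of observations. The function names are my own, and the normal approximation used for the p-value is crude for a series this short, so treat the result as a hint rather than a verdict.

```python
import math


def signs_from_series(values):
    """'+' if a value is above the previous one, '-' otherwise (ties count as '-')."""
    return "".join("+" if b > a else "-" for a, b in zip(values, values[1:]))


def count_runs(signs):
    """Count maximal blocks of identical consecutive symbols."""
    if not signs:
        return 0
    return 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)


def runs_up_down(signs):
    """Observed runs, expected runs, and z-score for the runs up-and-down test."""
    n = len(signs) + 1  # n observations produce n - 1 signs
    expected = (2 * n - 1) / 3
    variance = (16 * n - 29) / 90
    runs = count_runs(signs)
    z = (runs - expected) / math.sqrt(variance)
    return runs, expected, z


signs = "+++--++-+----"  # the example sequence from the post
runs, expected, z = runs_up_down(signs)

# Two-sided p-value from the normal approximation.
p_two_sided = math.erfc(abs(z) / math.sqrt(2))
print(f"runs = {runs}, expected = {expected:.1f}, z = {z:.2f}, p = {p_two_sided:.3f}")
```

With real data you would build the sign string with `signs_from_series(your_normalized_values)` instead of typing it in by hand.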
If you need some help with analyzing the data once you determine the runs, post your results here, or google "Non-Parametric Statistics Runs Test".
-MagicT
Re: Any Statisticians here?
Hi Gary,
The "nonparametric" stats method that I use to check whether a set of data is random or not (based on the number of times the data change chronologically from increasing to decreasing) tells me that the data are random, although I would need at least 26 observations to be sufficiently confident in that conclusion. On the other hand, an attempt could be made to find a trend and a cyclical effect using the Holt-Winters formulas, but there are too few data points for the conclusions to be statistically significant. If this were not the case, you could estimate the value of the next observation within an interval at a given level of confidence.
I am sorry that I can’t provide more help.
Re: Any Statisticians here?
There is another "nonparametric" stats method, the Spearman rank correlation test, which might help if you suspect that a cause-effect relationship exists between your data above (effect) and other data (cause). If you track the behaviour of the latter, an estimate of the former could be obtained.
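To illustrate, here is a small plain-Python sketch of the Spearman coefficient, computed as the Pearson correlation of the ranks (tied values get the average of their ranks). The two series below are made-up numbers purely for illustration, and "downtime hours" is a hypothetical candidate cause, not something from Gary's data. Keep in mind that a strong rank correlation only shows the two series move together; it does not prove which one causes the other.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the block of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical monthly figures, for illustration only.
productivity = [0.0100, 0.0096, 0.0102, 0.0094, 0.0099, 0.0105]
downtime_hours = [12, 18, 14, 21, 10, 8]

rho = spearman_rho(productivity, downtime_hours)
print(f"Spearman rho = {rho:.3f}")
```

A value near +1 or -1 means the two series rise and fall together (or in opposition); a value near 0 means no monotonic relationship shows up in the ranks.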