-
Jan 4th, 2015, 07:32 AM
#1
[RESOLVED] Gaussian Sigma and Pascal's Triangle
We all know (don't we? ) that the numbers in given row of Pascal's triangle divided by their total give the coefficients of a 1-dimensional Gaussian kernel. How can I derive the Sigma of the corresponding Gaussian from the row number of the triangle, and vice versa?
BB
-
Jan 4th, 2015, 08:04 PM
#2
Re: Gaussian Sigma and Pascal's Triangle
You seem to be referring to the fact that the rows of Pascal's triangle, when plotted after normalization, look very much like a normal distribution, especially as you take longer and longer rows. You seem to be asking for the parameters of that distribution, in particular the variance.
First, to explain this phenomenon, consider flipping a fair coin N times and add up the number of times heads came up. What's the probability that there have been K heads? Well, essentially we need to count the number of ways of distributing K heads (so N-K tails) in a list of size N, which is one definition of the binomial coefficient (N choose K). Of course to compute probabilities we need to divide by 2^N, the total number of ways the coin flips can come out.
If N is fixed, varying K then moves us along the Nth row of Pascal's triangle (before dividing by 2^N, at least), connecting coin flips to Pascal's triangle. On the other hand, we more or less added a bunch of "independent identically distributed" random outcomes together to get these probabilities. The famous Central Limit Theorem in probability tells us that in this situation, essentially regardless of the distribution of the random process, as you add more and more trials together the result approaches a Gaussian distribution.
Here's a rough quantitative version of the theorem. Given N independent identically distributed random variables with mean mu and variance sigma^2, then the "shifted, normalized average" sqrt(N) * (average of the trials - mu) is approximately normally distributed with mean 0 and variance sigma^2. In our case mu=1/2 and sigma^2=1/4, so roughly the distribution of x=sqrt(N) * (average of the trials - 1/2) is approximately sqrt(2/pi) e^(-2x^2). Since we're interested in the sum of the trials rather than the average, we rearrange this equation a bit to get...
(N choose K)/2^N
= probability the sum of trials is K
~= sqrt(2/pi) e^(-2 (K - N/2)^2 / N)/sqrt(N)
= 1/(sqrt(N/4) sqrt(2pi)) e^(-(K-N/2)^2 / (2 N/4))
(The extra sqrt(N) shows up on account of the variable substitution. It's required to keep the total probability 1.)
In all, (N choose K)/2^N is approximately a Gaussian distribution with mean N/2 and variance N/4, so standard deviation sqrt(N)/2.
Warning: the Central Limit Theorem uses shifted, normalized averages for a good reason. Letting K=N/2, our approximation says (N choose K) ~= sqrt(2/pi) 2^N/sqrt(N). And indeed, at N=1000 we have (1000 choose 500) = 2.70288... * 10^299 and sqrt(2/pi) 2^1000/sqrt(1000) = 2.70355... * 10^299. The trouble is that after we've done our variable substitution, the relative error away from the mean gets enormous. Using N=1000, K=200, the actual value of (1000 choose 200) is 6.617... * 10^215, whereas our approximation is 5.74022... * 10^222. This is seven orders of magnitude off. On the scale of 10^299, this error is incredibly small, but on the scale of 10^215, this error is enormous. The shifted, normalized averages more or less use 10^299 as the scale.
The time you enjoy wasting is not wasted time.
Bertrand Russell
<- Remember to rate posts you find helpful.
-
Jan 5th, 2015, 07:46 AM
#3
Re: Gaussian Sigma and Pascal's Triangle
Thanks Jemediah. That's a terrific explanation of the theory. The most important result for me is:
sigma = sqrt(N)/2
I had tried working out the SD of a few rows of Pascal's Triangle by hand but I must have been making errors because the results didn't look very convincing.
My primary interest is finding the coefficients of a convolution kernel for blurring images, e.g. a N=5 kernel would be
[1 4 6 4 1]/16
I guess accuracy isn't too critical for this purpose since blurring is only pseudo-random anyway. I could calculate the exact coefficients using the Gaussian formula (or maybe just find them somewhere on the web) but I liked the idea of making a simple function to return the values for a given sigma based on Pascal's triangle. More importantly, it will help me understand discussions of smoothing in terms of sigma.
BB
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
Click Here to Expand Forum to Full Width
|