PDA

Click to See Complete Forum and Search --> : calculation probability


Muddy
May 28th, 2009, 01:24 PM
Lets say I have N buckets and M marbles.

How can I calculate the maximum number of ways to satisfy different conditions? example:

how many ways can I arrange 6 marbles in 4 buckets such that

1 bucket has 1 marble,
another bucket has 1 marble,
another bucket has 2 marbles,
another bucket has 2 marbles

THe marbles are unique (eg marbles 1 and 2 in bucket 1 is not the same as marbles 1 and 3 in bucket 1 even though its still 2 marble in one bucket either way)

I need a general solution to this for any M and any N and any arrangement.

This is a small peice of a bigger probability problem I am working on. Any help greatly appreciated.

jemidiah
May 28th, 2009, 03:42 PM
Do the following: choose one of your six marbles for bucket 1; choose 1 of the remaining 5 for bucket 2; choose 2 of the remaining 4 for bucket 3; the final 2 must go in bucket 4. We can calculate how many options this gives using "n choose k" (http://en.wikipedia.org/wiki/Binomial_coefficient) (I'm abbreviating it as nck). This gives us 6c1*5c1*4c2*2c2 = 6*5*6*1 = 180 solutions to your problem.

I'd generalize this to any M and N, but you haven't really specified it.

Also, I've taken the buckets to be unique. If this is wrong, please say so. That is, I've counted marble 1 in bucket 1 and marble 2 in bucket 2 as different from marble 2 in bucket 1 and marble 1 in bucket 2.

Muddy
May 28th, 2009, 04:03 PM
Do the following: choose one of your six marbles for bucket 1; choose 1 of the remaining 5 for bucket 2; choose 2 of the remaining 4 for bucket 3; the final 2 must go in bucket 4. We can calculate how many options this gives using "n choose k" (http://en.wikipedia.org/wiki/Binomial_coefficient) (I'm abbreviating it as nck). This gives us 6c1*5c1*4c2*2c2 = 6*5*6*1 = 180 solutions to your problem.

I'd generalize this to any M and N, but you haven't really specified it.

Also, I've taken the buckets to be unique. If this is wrong, please say so. That is, I've counted marble 1 in bucket 1 and marble 2 in bucket 2 as different from marble 2 in bucket 1 and marble 1 in bucket 2.

Thanks for the reply ... Im digesting it now.

The buckets are unique for my problem ... sorry I should have said that up front

Muddy
May 28th, 2009, 05:47 PM
hmmm, there must be something flawed in my approach. I ultimately want to know the probability of getting a particular mix if I randomly put M marbles into N buckets (each marble has an equal chance of being put into any bucket. For any specific result the probability would (I think) be (1/N)^M ... so if there are 180 different ways to get the "1122" arrangement (in any order marbles/buckets) the probability would be 180*(.25^6) which = 4.4% ... but .... I know this is wrong, because I approximated this problem with a monte carlo simulation and the probability was ~26.3%. The entire slate of results from my MC sim for the 4 bucket 6 marble problem was:

123 35.1%
1122 26.3%
1113 11.8%
222 8.8%
114 8.8%
24 4.4%
33 3.0%
15 1.8%
6 0.1%

How can I duplicate this with an exact analyitcal equation?

Thanks again for the help!

jemidiah
May 28th, 2009, 06:09 PM
I'll explain my answer further. A common technique in combinatorics (http://en.wikipedia.org/wiki/Enumerative_combinatorics) is to count something by considering the number of choices needed to build a model satisfying the constraints.

First, you create an algorithm involving a sequence of choices. This algorithm must use the same number of choices at each step no matter what your previous choices were (otherwise the analysis gets complicated), and it must allow you to arrive at every solution in exactly one way (to prevent over or under counting). Once you have such an algorithm, you must count the number of choices at each step, and multiply them. You can draw a tree to illustrate that they need to be multiplied: if you have to choose between 3 and then 4 things, draw three initial branches, and on each of those draw four secondary branches, resulting in 12 endpoints.

As for "n choose k" or nck as I've written, it's a common symbol in mathematics denoting the number of ways to choose a committee of k people from a pool of n people (a committee is used to denote that order doesn't matter). So, 4c2 is the number of ways to make a committee of 2 people out of 4--numbering the people, possible groups are 12, 13, 14, 23, 24, 34, giving 6 ways.


My algorithm above constructs a satisfying model for your constraints. First, it satisfies bucket 1 containing 1 marble. Then it satisfies bucket 2 containing 1 marble, and so on. I counted the number of choices at each step and multiplied them to get the answer of 180.

Edit: will respond to your most recent post later tonight.

jemidiah
May 28th, 2009, 07:35 PM
Given 4 numbered buckets and 6 numbered marbles, the number of ways to place those marbles is 4^6; for each marble, choose which bucket it goes in. This gives 4096 arrangements, so the probability of a random arrangement satisfying constraint 1122 is 180/4096 = 4.4%, as you said.

I wrote a Monte Carlo simulation to test this,


Private Type Bucket
numMarbles As Long
End Type

Private Sub Form_Load()
Dim buckets(1 To 4) As Bucket
Dim i As Long, j As Long, yes As Long

Randomize
For i = 1 To 10000
'reset buckets
For j = 1 To 4
buckets(j).numMarbles = 0
Next j

'place marbles randomly
For j = 1 To 6
b = Int(Rnd * 4) + 1
buckets(b).numMarbles = buckets(b).numMarbles + 1
Next j

'check if it satisfies our constraints
yes = yes + Satisfies(buckets)
Next i

'return the fraction of successes
MsgBox yes / 10000
End Sub

Private Function Satisfies(buckets() As Bucket) As Long
If buckets(1).numMarbles = 1 And buckets(2).numMarbles = 1 And _
buckets(3).numMarbles = 2 And buckets(4).numMarbles = 2 Then

Satisfies = 1
Else
Satisfies = 0
End If
End Function


It gives the expected 4.4% success rate. Perhaps I'm misunderstanding some part of the setup?

Muddy
May 28th, 2009, 08:00 PM
Given 4 numbered buckets and 6 numbered marbles, the number of ways to place those marbles is 4^6; for each marble, choose which bucket it goes in. This gives 4096 arrangements, so the probability of a random arrangement satisfying constraint 1122 is 180/4096 = 4.4%, as you said.

I wrote a Monte Carlo simulation to test this,


Private Type Bucket
numMarbles As Long
End Type

Private Sub Form_Load()
Dim buckets(1 To 4) As Bucket
Dim i As Long, j As Long, yes As Long

Randomize
For i = 1 To 10000
'reset buckets
For j = 1 To 4
buckets(j).numMarbles = 0
Next j

'place marbles randomly
For j = 1 To 6
b = Int(Rnd * 4) + 1
buckets(b).numMarbles = buckets(b).numMarbles + 1
Next j

'check if it satisfies our constraints
yes = yes + Satisfies(buckets)
Next i

'return the fraction of successes
MsgBox yes / 10000
End Sub

Private Function Satisfies(buckets() As Bucket) As Long
If buckets(1).numMarbles = 1 And buckets(2).numMarbles = 1 And _
buckets(3).numMarbles = 2 And buckets(4).numMarbles = 2 Then

Satisfies = 1
Else
Satisfies = 0
End If
End Function


It gives the expected 4.4% success rate. Perhaps I'm misunderstanding some part of the setup?

I'm just not stating the problem clearly/correctly probably. The condition Im interested in gets satisfied no matter what the order:
1122
1221
2112
1212
2121
as long as there are 2 in any bucket, 2 in any other, 1 in any other, and 1 in any other the condition for which I want the probability is met.

thanks for all of the help!

Muddy
May 28th, 2009, 08:05 PM
hmmm... since there are 6 ways to get there:
1122
1221
2112
1212
2121
2211

and each is 4.4% probability

The probability Im looking for is 6 * 4.4% = 26.4 which agrees with my MC sim!

How do I put this in terms of a general solution for any M and N combination though??

Thanks again for the help!

Muddy
May 28th, 2009, 08:28 PM
what I essentially need is a sub

Probs(M as integer, N as integer, Result() as variant)

used like:

call Probs(6,4,Result)

that, using an analytical equation(s) would return for any M, N:

Result(1,1) = 123 Result(2,1) = 35.1%
Result(1,2) = 1122 Result(2,2) = 26.3%
Result(1,3) = 1113 Result(2,3) = 11.8%
Result(1,4) = 222 Result(2,4) = 8.8%
Result(1,5) = 114 Result(2,5) = 8.8%
Result(1,6) = 24 Result(2,6) = 4.4%
Result(1,7) = 33 Result(2,7) = 3.0%
Result(1,8) = 15 Result(2,8) = 1.8%
Result(1,9) = 6 Result(2,9) = 0.1%

With your help I can see now how to hand solve this, but still cant figure out a way to generalize it into a subroutine that will take any M,N and return probs for all possible outcomes.

jemidiah
May 28th, 2009, 10:30 PM
Now I get it. That's what I was worried about in the first post. You'd have to count the number of distinct arrangement of buckets, as you've done. For 1122, you'd choose 2 of the 4 spots for the 1's, choosing 2 of the remaining 2 spots for the 2's. This gives 4c2 = 6 arrangements.

I can generalize this all for you, and will edit this post shortly.

jemidiah
May 29th, 2009, 04:24 PM
Got busy last night. Anywho, you'll need a better data structure than a string in the format we've been using to store your constraints, since you'll run out of digits for large M. I'm just going to take the constraints to be an array from 1 to M, where the ith item denotes the number of buckets with i elements.

To calculate the number of arrangements with unique buckets (as in my first post), do the following. I'm assuming that all marbles are used in the constraints, since those are the only situations you've given.


oArrange = 1 'number of ordered arrangements; multiplicative identity
marbles = M 'number of marbles left
For i = 1 To M
Do Until Constraint(i) = 0
'Satisfy one bucket of this constraint by placing i of the remaining
' marbles into it, yielding a choice between (marbles choose i)
' options
oArrange = oArrange * Choose(marbles, i)
marbles = marbles - i
Constraint(i) = Constraint(i) - 1
Loop
Next i


For the 1122 example, Constraint would be an array from 1 to 6, with Constraint(1) = 2, Constraint(2) = 2, Constraint(3...6) = 0. This would take 1*Choose(6, 1)*Choose(5, 1)*Choose(4, 2)*Choose(2, 2) = 180 as in my first post. Different data types would work fine; I chose something convenient.

To count the number of arrangements without unique buckets (i.e. unordered), count the number of bucket orders with the following:


bArrange = 1 'number of bucket arrangements; multiplicative identity
spots = N 'number of bucket spots left
For i = 1 To M
'For the Constraints(i) buckets with i items in them, choose that
' many places out of the number of spots left
bArrange = bArrange * Choose(spots, Constraints(i))
spots = spots - Constraints(i)
Next i


Again with the 1122 example, this would give 1*Choose(4, 2)*Choose(2, 2)*Choose(0, 0)*Choose(0, 0)*Choose(0, 0)*Choose(0, 0) = 6. Multiply bArrange and oArrange to get uArrange, the number of unordered arrangements satisfying your constraints. Take N^M as tArrange, to get the total number of possible arrangements, and then your probability is just uArrange/tArrange. I'm sure you can fill in the details to fit this into your Result() array scheme.

I have one lingering question: do you need to also generate all possible constraints for a given N and M? That's harder.


The Choose function implements the standard binomial coefficient function nCk. Specifically, nCk = n!/(k!*(n-k)!) where ! denotes factorial. Note that by convention 0! = 1. To recursively define factorial, use that base case and the fact that n!=n*(n-1)!.

Muddy
May 29th, 2009, 09:40 PM
...
I have one lingering question: do you need to also generate all possible constraints for a given N and M? That's harder.
...

I'm working on that peice now ... I might end up brute forcing it (loop all possible combos keeping only the ones that add up to my number) if it executes fairly quickly.

Thanks for all the great help.

Nice to see someone else still around from early 2000's btw!

jemidiah
May 30th, 2009, 01:20 AM
Looping shouldn't be terrible. A version I have in mind feels almost optimal. I'll think about it more if you need help.

Yeah, it's been a while on these forums. I'm sad when I think VB6 will be obsolete soon.