Suppose there are 30 people in a room.
How do I calculate the probability of at least two of them having a Birthday on the same day of the year?
I always get stuck trying to figure this out.
I can't reach a definitive answer, but let me try a few steps which I think are necessary.
Probability of an individual's birthday falling on a particular day: 1/365;
Probability of any 2 individuals having a birthday on the same day: 1/365 x 1/365 = 7.5 x 10^(-6);
Number of combinations of any 2 individuals out of 30: 30!/(2!(30-2)!) = 435
And now, I wonder how to proceed! 435 x 7.5 x 10^(-6) = 3.26 x 10^(-3) perhaps? I don't think so...
Prob for 1 person to have his birthday today is 1/365.
Prob for two persons (1/365)^2
Prob for the other not to have their birthday today (364/365)^28
total prob (1/365)^2* (364/365)^28
Sorry, the above solution is for EXACTLY two persons; the question was "at least 2". Let me think...
The solution for at least two persons having their birthday is (1/365)^2 * (365/365)^28, which is the same as (1/365)^2.
The part for all the other people can be neglected (factor 1), since it doesn't matter if they have their birthday today or not.
Your answer doesn't seem logical, as the 30 people in the room don't come into play. I think that the more people you have in the room (from which to pick a sample), the higher the chance of getting any two with a birthday on the same day. So the probability has to be higher than (1/365)^2, tending to 1 as the number of people in the room tends to infinity.
Good point, let me think again...
Use the backward approach: instead of looking for the probability that at least 2 people have their birthday, look for the probability that none or exactly one person has his birthday.
None: (364/365)^30
Exact one: (1/365)*(364/365)^29
Two or more: 1- (364/365)^30 -(1/365)*(364/365)^29
I like that but, before I agree, what would you write down for exactly three people?
My aim is to find a general expression depending on N (number of people in the room) and on n (number of people having their birthday on the same day) and then test for the rationale.
(1/365)^n * (364/365)^(N-n)
Assume 2 people in the room
The probability is then 1/365 not 1/365^2
??
What probabiliy would be 1/365?
Both having birthday on that day (NoNoNo)? Only one (yes)
No, not a particular day, but both the same day
One person on a particular day is 1/365
Two on a particular day is 1/365^2
but two on the same day no matter what day is
1/365^2*365 since there are 365 days in a year
Assuming the above is correct, what is it for 3 people in the room?
Persons A, B and C
A & B = 1/365
A & C = 1/365
B & C = 1/365
so would it be 3/365 minus some overlap?
What is the overlap, it's important or probability could exceed one for more than 365 people
I'm thinking overlap in this case is A = B = C = 1/365^2
so 3 people is 3/365 - 1/365^2
I'm totally making this stuff up, but it makes sense to me.
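A quick way to test the three-person guess is exact fraction arithmetic on the complement (all three birthdays different). This is only a sketch, assuming 365 equally likely days; the variable names are illustrative.

```python
from fractions import Fraction

days = 365

# Probability that A, B and C all have different birthdays.
p_all_different = Fraction(days * (days - 1) * (days - 2), days**3)

# Exact probability that at least two of the three share a birthday.
exact = 1 - p_all_different

# The guess from the post above: 3/365 minus an overlap of 1/365^2.
guess = Fraction(3, days) - Fraction(1, days**2)

print("exact:", exact, float(exact))
print("guess:", guess, float(guess))
print("exact equals 3/365 - 2/365^2:",
      exact == Fraction(3, days) - Fraction(2, days**2))
```

Under these assumptions the overlap works out to 2/365^2 rather than 1/365^2, so the guess is close but slightly too high.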
As a test of these theories, I thought I'd try using VB to make random sets of data and check for repeats.
Code I used (it could very well be mistaken):
VB Code:
Const NumOfPeople = 30
'Requires Text1 and List1
'Debugging bits commented out
Private Sub Form_Load()
    Dim i As Long
    Dim BDays(1 To NumOfPeople) As Date
    Dim IsTrueFlag As Boolean
    Me.Show
    Randomize
    For i = 1 To 10000
        If i Mod 1000 = 1 Then Randomize: DoEvents: Me.Caption = i
        ' List1.Clear
        IsTrueFlag = False
        For i2 = 1 To NumOfPeople
            BDays(i2) = DateDiff("d", #1/1/1900#, Rnd * 100000)
            ' List1.AddItem i2 & ": " & Month(BDays(i2)) & "/" & Day(BDays(i2)) & "/" & Year(BDays(i2))
        Next i2
        For i2 = 1 To NumOfPeople - 1
            For i3 = i2 + 1 To NumOfPeople
                If Day(BDays(i3)) = Day(BDays(i2)) And Month(BDays(i3)) = Month(BDays(i2)) Then
                    IsTrueFlag = True
                    ' MsgBox i2 & " = " & i3
                End If
            Next i3
            If IsTrueFlag = True Then Exit For
        Next i2
        If IsTrueFlag = True Then
            TotalTrues = TotalTrues + 1
            IsTrueFlag = False
            Text1.Text = TotalTrues * 100 / i
            Me.Refresh
            ' MsgBox "True"
        Else
            ' MsgBox "False"
        End If
    Next i
    Me.Caption = "Done"
End Sub
Now, running with 10000 iterations, I get a rate of around 70% +/- 1%, which seems very high to me. I tried it with 2 people, and got 0.28% +/- 0.4%, which seems correct (1/365.242, accounting for leap years, = 0.273%).
Probability is my worst area of math, so I won't try to solve it analytically :)
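For comparison, the same kind of experiment can be sketched in a few lines of Python. This is not a port of the VB code: it assumes 365 equally likely days (no leap years), and the function name `simulate` and its parameters are made up for illustration.

```python
import random

def simulate(num_people=30, trials=10_000, days=365, seed=None):
    """Estimate the chance that at least two of num_people share a birthday."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Draw one birthday (a day index) per person.
        birthdays = [rng.randrange(days) for _ in range(num_people)]
        # A repeat exists exactly when the set is smaller than the list.
        if len(set(birthdays)) < num_people:
            hits += 1
    return hits / trials

print(simulate(30, seed=1))
```

Using a set to detect repeats avoids the nested pair-by-pair loop, and passing a seed makes a run repeatable.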
Let A be the complementary event:
A = no two people share a birthday. Then
P(A) = (365/365)*(364/365)*(363/365)*...*((365-n+1)/365),
so 1-P(A) = 1 - 365*364*363*...*(365-n+1)/365^n.
(If n>=23 then 1-P(A)>0.5!)
Regards
Marco
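The product formula is easy to evaluate directly. Here is a minimal Python sketch, assuming 365 equally likely days; `p_shared` is an illustrative name, not something from the thread.

```python
def p_shared(n, days=365):
    """Probability that at least two of n people share a birthday."""
    if n > days:
        return 1.0  # pigeonhole: more people than days
    p_distinct = 1.0
    for k in range(n):
        # Person k+1 must avoid the k birthdays already taken.
        p_distinct *= (days - k) / days
    return 1.0 - p_distinct

print(p_shared(30))

# Smallest group size for which a shared birthday is more likely than not.
n = 1
while p_shared(n) <= 0.5:
    n += 1
print(n)
```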
I've heard that the answer is higher than most people expect.
Marco, could you explain to me how you get P(A)?
Total cases=365^n
Favourable cases (nobody has the same birthday):
The first person has......................365 possibilities
The second person has..................364 possibilities (his birthday is not the same as person 1)
The third person has......................363 possibilities (his birthday is not the same as person 1 and not the same as person 2)
The n-th person has.......................(365-n+1) possibilities
Then favourable cases=365*364*363*...*(365-n+1)
P(A)=favourable cases/total cases.
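The counting argument can be checked by brute force on a toy calendar, where listing every outcome is feasible. The sketch below assumes a hypothetical 7-day year and 4 people; these numbers are chosen only to keep the enumeration small.

```python
from itertools import product
from math import perm

def count_all_distinct(n, days):
    """Count the outcomes in which all n birthdays are different, by listing them."""
    return sum(
        1
        for outcome in product(range(days), repeat=n)
        if len(set(outcome)) == n
    )

days, n = 7, 4
total_cases = days**n                   # every possible assignment of birthdays
favourable = count_all_distinct(n, days)
by_formula = perm(days, n)              # 7 * 6 * 5 * 4

print(total_cases, favourable, by_formula)
print("P(A) =", favourable / total_cases)
```

The enumerated count agrees with the product days*(days-1)*...*(days-n+1), which is the same reasoning applied to 365 days.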
As a test of all that has been said up to this point, and backing up Jemidiah with his VB code, I tried a simulation model in Excel and after a few repetitions I got the following results:
Probability of 0 = 0,28 +/- 0,06 after 200 repetitions and a confidence level of 95%;
Probability of 2 or more = 0,68 +/- 0,06 after 200 repetitions and a confidence level of 95%;
Probability of 3 or more = 0,022 +/- 0,015 after 500 repetitions and a confidence level of 95%.
You can check my rationale and the above results yourselves by using the attached simulator.
If you don't agree, please try to convince me that I am wrong!...
Thanks Marco,
You have answered a question that has bugged me for a while.
Marco's development of the solution is logical, and if you plug in the numbers it matches our known cases:
n=2: 1/365
n=366: 1
and jemidiah's simulation for n=30: 70.6% (ignoring leap years).
My guess for 3 in a room, however, is off by a little, so something is wrong with my logic there.
Next time you're in a room of 30 people, you can bet that at least two share the same birthday!
Although this thread has been "resolved", I am not at all satisfied with the answer given by Marco or with the last comment from Moeur. I am quite confident in the results I got from the simulation model I attached in my previous post. Results derived from Marco's expressions don't coincide with mine, and I am still wondering why. Perhaps a typo in Marco's post is the reason...
Any comments would be appreciated.
See for instance:
http://www.mste.uiuc.edu/reese/birthday/
Regards
Marco
Edit note: I fixed my conclusions from 29% to 70%
Well, assuming all 366 days are equally likely (which they are not, but it will make the answer easier):
Answer is:
x is the number of students in the room.
Here is the set of equations:
For x < 366, 1 - [366!/(366-x)!]/(366^x)
For x = 366, 1 - [366!/(366^366)]
For x > 366, 1
So for x = 30, the answer is 1 - 0.2947.
If there are 30 students in the class, there is a 70.53% chance that at
least 2 of them will have the same birthday.
A little detail:
[366!/(366-x)!]/(366^x) is the probability that no two students have the same birthday (given x students), so 1 minus that is the probability that at least 2 students have the same birthday.
Quote:
Originally Posted by Rassis
From the site http://www.mste.uiuc.edu/reese/birthday/ recommended by Marco, I could finally conclude that my simulation model is correct, as it returns the same result (+/-0,7) as explained on the aforementioned site.
If any 30 people meet in the same place, there is a 1 - (365 x 364 x 363 x ... x (365 - 30 + 1))/365^30 = 0,706316 chance that at least two of them have the same birthday (and not 0,2947 as according to Capsulecorpjx; please consider revising your conclusions).
I am satisfied now because I know how to solve the problem both ways…thank you all.
lol, for all my equations, I forgot to do 1 - 0.29.
So my equations are correct, I just didn't execute it correctly for the example of 30 students.
Quote:
Originally Posted by Rassis
Marco is essentially correct, but calculating the exact probabilities seems more complex. His formula ignores consideration of leap years.
Perhaps use his formulae twice, once for years with 365 days and once to years with 366 days. Then add 75% of the first value and 25% of the second. This might be more precise, but is probably not exact. For those born on 29 February of leap years, we need a precise definition of when and how often they celebrate a birthday. I would not like to spend the time to try for an exact solution to this problem.
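The weighted-average idea can be sketched as follows. This only illustrates the approximation described above (75% weight on 365-day years, 25% on 366-day years, every day within a year treated as equally likely); it is not an exact treatment of 29 February, and the function names are hypothetical.

```python
def p_shared(n, days):
    """Probability of at least one shared birthday among n people, uniform days."""
    if n > days:
        return 1.0
    p_distinct = 1.0
    for k in range(n):
        p_distinct *= (days - k) / days
    return 1.0 - p_distinct

def p_weighted(n):
    """Approximation: 75% weight on 365-day years, 25% on 366-day years."""
    return 0.75 * p_shared(n, 365) + 0.25 * p_shared(n, 366)

for n in (23, 30):
    print(n, round(p_shared(n, 365), 6), round(p_weighted(n), 6))
```

For 30 people the two figures differ only from the fourth decimal place onward.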
This is an old problem. Over the years, I wrote programs for various models of HP calculators. Ignoring consideration of leap years, my current calculator (HP 48GX) gives the following using Marco's formula (number of people: probability of a shared birthday, and the corresponding odds).
22: Probability = .475695 /// Odds = .907288
23: Probability = .507297 /// Odds = 1.029621
30: Probability = .706316 /// Odds = 2.405023
40: Probability = .891232 /// Odds = 8.193864
50: Probability = .970374 /// Odds = 32.753656
Sorry if there are any typos above.
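A table like this can be regenerated with a short script. The sketch assumes 365 equally likely days and takes "odds" to mean p/(1-p), which is consistent with the figures listed; `p_shared` and `odds` are illustrative names.

```python
def p_shared(n, days=365):
    """Probability that at least two of n people share a birthday."""
    p_distinct = 1.0
    for k in range(n):
        p_distinct *= (days - k) / days
    return 1.0 - p_distinct

def odds(p):
    """Odds in favour of an event with probability p."""
    return p / (1.0 - p)

for n in (22, 23, 30, 40, 50):
    p = p_shared(n)
    print(f"{n}: Probability = {p:.6f} /// Odds = {odds(p):.6f}")
```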
I hate to say this, but the leap year gets even more complicated than that :)
I can never remember it right, but I think it's once every 400 years you skip a leap year. The actual solar year is closer to 365.242 days, not 365.25.
So, in essence, the decimals don't matter. At least to me
The leap year algorithm is as follows (order of tests is critical). I am not sure if there is a more complex algorithm for years beyond the next few millenniums.
- If the year is evenly divisible by 400, it is a leap year.
- If the year is evenly divisible by 100, it is not a leap year.
- If the year is evenly divisible by 4, it is a leap year.
- Otherwise, it is not a leap year.
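The tests above translate almost line for line into code. Here is a minimal Python sketch, with the checks in the same order; `is_leap_year` is an illustrative name, and the standard library's `calendar.isleap` implements the same rule.

```python
import calendar

def is_leap_year(year):
    """Gregorian leap year rule, tested in the order given above."""
    if year % 400 == 0:
        return True
    if year % 100 == 0:
        return False
    return year % 4 == 0

for year in (1900, 2000, 2023, 2024, 2100):
    print(year, is_leap_year(year), calendar.isleap(year))
```

A plain divisibility-by-4 check would give the wrong answer for 1900 and 2100, which is why the century tests come first.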
Note that we lucked out when computers were developed in the twentieth century. There were a lot of programs which checked for divisibility by 4. This would have been wrong in 1900 or 2100, and could have caused more problems than the Y2K problem, whose potential effects were grossly exaggerated.
I am mildly obsessive compulsive. 40+ years ago, I wrote programs using the above leap year algorithm, even though checking for divisibility by 4 would work until 2100. I also wrote programs which worked with 4-digit dates internally, avoiding the Y2K problem 30-40 years in my then future.
Perhaps, I was not obsessive compulsive, merely arrogant, expecting my programs to still be in use long after I was dead.
I modified my simulation model to account for leap years (one year with 366 days followed by three with 365 days) and noticed no significant change in the results. It is my belief that probabilities will decrease just by a very small amount.
I don’t think it is worth the time spent to proceed in order to go up in accuracy.
Rassis: Your last post seems right on. I tried my approximation using a weighted average of probabilities for 365 & 366 day years. Probability was same for 3 digits. I suspect that the weighted average is an improvement although not precise.
Guv,
If you use simulation as your preferred method, you choose 366 whenever a random number between 0 and 1 is less than or equal to 1/4, and 365 in other cases. I think this is very much the same as weighting both figures if you use an analytical method instead. You must be right. And the results may differ only a little from the previous solution (the one not accounting for leap years). Simulation doesn't allow such a high degree of accuracy, but the approach is fast and the rationale is strong.
Well, that's all moot considering that people are more likely to be born in, say, the spring. So you'd have to do a survey on which days people are more likely to be born on, which is directly related to the fact that people tend to get married during a certain season.
Quote:
Originally Posted by jemidiah
To get an exact probability, you need to know what day every person that can be in that class is born on, then appropriate the weights accordingly.
Capsulecorpjx raised an interesting issue. After he finishes the "survey on what days are more likely for people to be born on" and finds "the appropriate weights" (!), simulation still seems to me to be an effective way to find probabilities. Or do you think it could be achieved analytically too?
You don't need to simulate anything.
Quote:
Originally Posted by Rassis
Say you're randomly choosing 30 people from every U.S. citizen right at this moment.
Find the proper weight for each day, depending on what percent of the population each day is the birthday. Plug the weights into the equation (I gave earlier). Then you can just use a calculator to determine the exact probability.
If you just use simulation, you'll always be just a bit off from the theoretical probability.
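With unequal day weights the simple product formula no longer applies as written, but an analytic calculation is still possible. One standard identity: the probability that n independent birthdays are all different equals n! times the n-th elementary symmetric polynomial of the day probabilities. The sketch below assumes birthdays are drawn independently from the given weights; the function name and the skewed weights are hypothetical illustrations, not survey data.

```python
from math import factorial

def p_shared_weighted(n, weights):
    """P(at least two of n people share a birthday) for unequal day weights."""
    total = sum(weights)
    # e[j] accumulates the j-th elementary symmetric polynomial
    # of the normalized day probabilities seen so far.
    e = [1.0] + [0.0] * n
    for w in weights:
        p = w / total
        for j in range(n, 0, -1):  # go downward so each day is used at most once
            e[j] += p * e[j - 1]
    return 1.0 - factorial(n) * e[n]

uniform = [1.0] * 365
skewed = [2.0] * 100 + [1.0] * 265  # hypothetical: 100 days twice as likely

print(p_shared_weighted(30, uniform))
print(p_shared_weighted(30, skewed))
```

With equal weights this reproduces the usual answer for 365 days, and any skew in the weights pushes the probability of a match up.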
Right, I partly agree. Notice that you are suggesting sampling (without replacement, for sure) from the population, which would cause an inherent error whose magnitude would depend on how far the survey is extended (the sample size) and on the desired confidence level. As a matter of fact, it would be neither practical nor affordable to cover the whole population. Therefore, "the exact probability" will never be achieved. The same conclusion holds for the simulation, which is also statistically based.
Quote:
Say you're randomly choosing 30 people from every U.S. citizen right at this moment. Find the proper weight for each day, depending on what percent of the population each day is the birthday. Plug the weights into the equation (I gave earlier). Then you can just use a calculator to determine the exact probability. If you just use simulation, you'll always be just a bit off from the theoretical probability.