The birthday paradox is that everyone has equal probability of having a birthday on any given of 365 days. We start adding people in a room. What is the probability that 2 people have birthdays on same day as a function of number of people in the room? The code I wrote is as follows:
import numpy as np
import matplotlib.pyplot as plt
x=[0]
y=[0]
for j in range(1000):
if j!=0:
freq = []
L1 = list(np.random.randint(low = 1, high=366, size = j))
result = list((i, L1.count(i)) for i in L1)
for a_tuple in result:
freq.append(a_tuple[1])
print(freq)
rep = j - freq.count(1)
prob = rep/j
y = y + [prob]
x = x + [j]
print(prob)
plt.plot(x,y)
Here, in L1 = list(np.random.randint(low = 1, high=366, size = j))
I select the day on which someone would have a birthday and in result = list((i, L1.count(i)) for i in L1)
I calculate the frequency of birthdays on each day. The entire thing is looped over to account for increasing number of people.
In the following for loop, I isolate the unique events and find repetitions and store the value in rep. Next I calculated the probability as fraction of people sharing birthdays and plotted them as a function of number. However, the question requires me to find the probability of just one shared birthday. How do I calculate that? I think I have to loop this entire thing for number of trials but that just gives an accurate solution with less variations of the same program. Currently my program gives fraction of people having shared birthdays I think.
NOTE
I assume that when
n
persons have been in the room, they are all thrown out of the room and thenn+1
persons enter the room.========================================
I would think of it this way;
First, set
probs = [0]*365
. Now, say 2 persons get in the room - we then write their birthdays onto a piece of paper and check, if those two dates are equal. If they are, we increaseprobs[2]
by 1 (yes, theres some indexes that we don't need, and Python is 0-indexed etc. but to keep it simple).Now do the same for 3 persons, for 4 persons, for 5 persons ... all the way up to 365. Your array might look something like
probs==[0,0,0,0,0,1,0,1,1,0,1,1,1,1,0,1....]
.You can now start over from 2 persons (still keeping the same array as before i.e don't create a new one with 0's!), then 3 persons etc. and start over 1000 times. Your array might look like
probs==[0,0,2,0,4,1,5,2,9,12,10,17....,967,998]
If you divide that array by 1000 (elementwise) you now have your simulated probability as a function of
n
persons.which generates a plot that looks like