Birthday Paradox count keeps growing in python

48 Views Asked by At

I need this code to output the total count of matching birthdays, but the count keeps growing and I don't know why. We should expect to see a total approx. half the amount of the simulations. I have it appending to a list at the moment to visualize the change in the count.

import random
birthmonth = []
birthday= []
def birthday_paradox():
    """
    
    """
    months31 = (1,3,5,7,8,10,12)
    months30 = (4,6,9,11)
    day31 = random.randint(1,31)
    day30 = random.randint(1,30)
    day29 = random.randint(1,29)
    month = random.randint(1,12)
    
    for i in months31:
        if month == i:
            birthmonth.append(month)
            birthday.append(day31)
    for i in months30:
        if month == i:
            birthmonth.append(month)
            birthday.append(day30)
    if month == 2:
        birthmonth.append(month)
        birthday.append(day29)

def frequency_of_matches(number_of_people:int = 23):
    """
    
    """
    count = 0
    for i in range(number_of_people):
        birthday_paradox()
    sorted_birthday_list = sorted(list(zip(birthmonth, birthday))) 
    
    for i in range(len(sorted_birthday_list)):
        if sorted_birthday_list[i] == sorted_birthday_list[i - 1]:
            count += 1
    return count
       
    
def simulations(number_of_simulations:int = 100):
    total = []
    for i in range(number_of_simulations):
        total.append(frequency_of_matches())
    return total


simulations(25)

Example output:

[1, 3, 7, 9, 13, 23, 28, 42, 51, 61, 70, 84, 97, 112, 124, 138, 153, 163, 178, 196, 213, 228, 244, 261, 277]
1

There are 1 best solutions below

6
Tim Roberts On

The root problem is that your birthmonth and birthday lists keep growing and growing and growing. You never clear them out. That's why your design -- modifying globals -- is a bad one. The function should start fresh and RETURN its output. Globals are evil.

Also, your method of choosing the random birthdays is loony. Here is a replacement that only needs one random number, and only uses one global, which is a constant.

import random
import datetime

jan1 = datetime.datetime(2022,1,1)

def birthday_paradox():
    base = jan1 + datetime.timedelta(days=random.randint(0,364))
    return base.month,base.day

def frequency_of_matches(number_of_people:int = 23):
    birthdays = [birthday_paradox() for i in range(number_of_people)]
    birthdays.sort()
    
    count = 0
    for i in range(len(birthdays)-1):
        if birthdays[i] == birthdays[i+1]:
            count += 1
    return count
       
def simulations(number_of_simulations:int = 100):
    total = []
    for i in range(number_of_simulations):
        total.append(frequency_of_matches())
    return total

print(simulations(25))

If you are really, really bothered by the lack of February 29 here, then change to:

    base = jan1 + datetime.timedelta(days=random.randint(0,365*4+1))