(I have coded a brute-force solution to the following, but am wondering if there is a more elegant path.)
Students in my high school were offered 8 electives. They chose their first, second, third and fourth choices. As there will be two elective periods, each will be able to go to two of their chosen electives.
It has not yet been determined which electives will be scheduled on which of the two periods, however. The scheduling problem is, knowing the preferences, to optimize the allocation of electives to the two days so that as many students as possible will be able to take both of their preferred electives.
For example, if I signed up for IT and Art, I can only take both if they are scheduled different periods. Of course, you might have signed up for Art and French, and someone else for French and IT.
As only two periods are available, it won’t be possible for everyone’s choices to be on different periods. How can we find the optimal allocation of electives to days that allows as many people as possible to get their two top choices?
My brute-force solution was to create a list of all possible allocations (using a combinatorics package) and test each. This works fine, as the data-set is small, but there surely must be a graph theory or matrix solution that is much more elegant—-and doesn’t increase resources in proportion to n! (n being the number of classes offered)!
The following imports a matrix structured as follows:
Course1 Course2 Course3 ...
Course1 ___a_______b_________c
Course2
Course3
Each cell shows how many students chose that particular pair as their 1st and 2nd choices. For example, the value of b is the # of students who chose Course1 and Course2 (in any order) as their top choices. (The matrix is diagonally symmetrical.) The top row of names is in the file and used to get the course names. The first column of course names does not appear in the file, however. ...
from itertools import combinations
from math import floor, ceil
from pandas import read_excel
def takesuccesses(entry): #Returns the number of successes for each recorded result
return entry[0]
def complement(set1): #Returns a list of all items not in set1
set2 = []
for i in Set:
if i not in set1: set2.append(i)
return set2
def courselist(set): #Translates a list of numeric codes to a string of course names
names = ""
for i in set:
if i is not StudyHall: names+=courses[i]+", " #Exclude Study Hall
return names[:-2]
def GenerateList(): #Creates a list of possible groupings of courses to meet first elective period
SetList = []
for setsize in range(min,max+1): #try various numbers of classes meeting per elective period
for ComboSet in combinations(Set[:-1], setsize): #Avoid duplicate solutions by always placing last course in second set
if StudyHall not in ComboSet: SetList.append(ComboSet) #Ensure Study Hall is not in set (avoids duplicate solutions)
return SetList
def count2d(matrix): #total all entries in a 2d matrix
total = 0
for row in matrix:
for entry in row: total+=entry
return total
for qn in range(1,5): #Cycle through all four quarters
quarter = "Q"+str(qn)
#import matrix tracking how many students chose each combination of electives
matr = read_excel("WithoutScience.xlsx", sheet_name=quarter)
matrix=matr.values
courses = list(matr.columns)
StudyHall=courses.index("Study Hall")
n= len(matrix) #Pick up number of courses offered from matrix
min = floor((n-2)/2) #Set a min number of courses per elective period
max = ceil((n+3)/2) #Set a max number of courses per elective period
Set = range(n) #list of all possible courses by number
SetList=GenerateList() #Create a list of plausible groupings of offerings
results = [] #List to record the total successes for each possible grouping
for set1 in SetList: #Check,for each combination, how many people can take both their 1st and 2nd choice
successes = matrix[StudyHall][StudyHall] #Start by counting anyone who chose two Study Halls as a success
#Cycle through cells in one half of the symmetrical matrix
for i in range(n-1):
for j in range(i+1,n):
#Check if these courses meet on different days or if one is study hall
if i==StudyHall or j==StudyHall or (i in set1 and j not in set1) or (j in set1 and i not in set1):
successes += matrix[i][j] #Students can take both these courses
results.append((successes, set1))
StudentCount = int(count2d(matrix)/2)
#Put the best combinations up front and print all optimal results
results.sort(key=takesuccesses, reverse=True)
besttotal, bestset = results[0][0],results[0][1] #The first optimal combination
firstbest=besttotal #Any solutions with this same value are also optimal
i=0
print("START OF QUARTER", qn)
while firstbest==besttotal: #Search for more optimal combinations
print("In", quarter, besttotal,"of", StudentCount, "students can take both their elective choices by grouping\n", courselist(bestset), "\n and\n", courselist(complement(bestset)), "\n")
i+=1
besttotal, bestset = results[i][0],results[i][1] #Load next possible optimal solution
print("END OF QUARTER", qn, "\n")
There are many variations on this problem. It's easy enough to do as a linear program with binary assignment variables for student-class-period triples. This variation places higher weight on classes that a student has ranked as being more preferred.
(4 + 3)*5 == 35 so all students have their top choices. A more realistic scenario where some compromises need to be made due to the number of students and diversity of votes:
In this scenario, all eight classes are scheduled, and the student preference metric is 93.6% where 100% would mean everyone got their top two choices.
The LP approach is quite flexible - you can apply other constraints such as minimum or maximum class sizes, number of classes scheduled per day, student class eligibility, etc.