Python Combinations of letters to boxes . boxes can be empty

150 Views Asked by At

I am trying to figure out an optimized Python code that, given a set of letters for example [A,B,C] it will arrange them in n_boxes boxes.

  • each box can have no letters, one letter, or a combination of letters
  • each solution cannot have duplicate letters.

I would like to compute in decent time (<2min??) all the possible solutions. i would say i will probably not have more then 6 boxes and 6 letters.

some examples of desired solutions for a given number of letters and number of boxes, follow in [this image][1]

I already did some tries :

My 1st Try:

I have already tried a powerset followed by permutations and then filter solutions with duplicates. something like this:

combos = power_set(letters)
per = itertools.permutations(combos, n_boxes)
for i in per:
    if solution does not have duplicate:
        yield i

where,

def power_set(iterable):
    "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
    "number os solution =2^len(iterable)"
    s = list(iterable)
    return itertools.chain.from_iterable(combinations(s, r) for r in range(len(s)+1))

but with n_boxes>4 it starts to take to much time..

2nd Try: I also tried something like

for combo in powerset(list(powerset(letters))):
   if there are no duplicate:
      for per in permutations(combo,n_boxes)
         yield per

this allows me to have a bigger number of boxes like 7(as desired) without taking endless time, but i cannot have more then 4 letters...

Thanks for your help [1]: https://i.stack.imgur.com/7oiC2.png

1

There are 1 best solutions below

2
bb1 On

You can try this:

from itertools import product
from string import ascii_uppercase


def f(nletters, nboxes):
    
    letters = ascii_uppercase[:nletters]

    for p in product(range(nboxes + 1), repeat=nletters):
        boxes = [[] for i in range(nboxes + 1)]
        for i, j in enumerate(letters):
            boxes[p[i]].append(j)
        yield boxes[:-1]

For example, list(f(nletters=2, nboxes=3)) gives:

[[['A', 'B'], [], []],
 [['A'], ['B'], []],
 [['A'], [], ['B']],
 [['A'], [], []],
 [['B'], ['A'], []],
 [[], ['A', 'B'], []],
 [[], ['A'], ['B']],
 [[], ['A'], []],
 [['B'], [], ['A']],
 [[], ['B'], ['A']],
 [[], [], ['A', 'B']],
 [[], [], ['A']],
 [['B'], [], []],
 [[], ['B'], []],
 [[], [], ['B']],
 [[], [], []]]

The idea is than every configuration of boxes can be described by a tuple (n_1, n_2, ..., n_m) where 0 <= n_i <= number_boxes and m is the number of letters. Then the letter number k will be assigned to the box number n_k. The range of numbers n_i exceeds the number of boxes by 1, since the last box is used to collect leftover letters not used in all other boxes. This last leftover box is then omitted from the results.