efficiently listing distinctive permutations that meet 'adjecancy requirements'

127 Views Asked by At

I'm trying to store distinctive permutations that meet additional (adjecancy) requirements. Below is a figure to illustrate the problem I am trying to solve.

Figure to illustrate the problem

I'm trying to make code that lists all distinctive permutations while taking adjecancy requirements into account (rule 3 in the figure). Two figures that show an acceptable and an unacceptable solution:

example solutions: acceptable and unacceptable

I first decided to model this problem as an array or string, where each index corresponds with a tile on the map (see figure on the bottom for the mapping). e.g. for the 7 blue tiles example:

['B1','R','B2','R','G','G','G','R','B2','R','B2','R','B1','R','B3','R','B3','R']

I then considered listing all distinct permutations, and filtering out infeasible solutions afterwards. For this I tried 'distinct_permutations()' from the library more_itertools. However, (unsurprisingly) this results in 183.783.600 candidate solutions that need to be checked for the given example.

from more_itertools import distinct_permutations 
solutions =
list(distinct_permutations(['T', 'R', 'T', 'U', 'R', 'T', 'I', 'T',
'O', 'T', 'U', 'R', 'T', 'I', 'T', 'O', 'T', 'U']

I was wondering whether it would be possible to code this in a more efficient manner; where adjecancy requirements are taken into account to directly exclude all, or at least a 'large proportion' of the infeasible solutions (instead of filtering them out afterwards).

One idea that I had was to use 'pairs' of tiles (e.g. 'B1+R'). Treating such 'pairs' as if they were single tiles would reduce the number of 'elements' considered in the permutations. However, while this might be helpfull on the edges, the centre of the map would mess things up.

Could anyone help me with finding an approach or solution to code this problem more efficiently? A nudge in the right direction would be greatly appreciated!

mapping of indices

1

There are 1 best solutions below

2
On

If you start generating from blue colors first then place reds next, you don't have to check for collisions and the number of permutations required is decreased. But if you need more performance than that, you can change to another data type. Perhaps 2 bits per tile, 4 tiles per byte, 32 tiles per 64bit integer.

If you don't want to optimize it by generating blues first, then finding collisions on specific indices can be simulated by masked collisions using integers:

  • separate arrays into three
  • blue on/off array --> 32 bit integer for whole tile pattern
  • red on/off array --> 32 bit integer for whole tile pattern
  • green on/off array --> 32 bit integer for whole tile pattern

start

 blue channel:                    0 0 0 0 1 0 0 0 0 (1 blue tile)
                                            
 mask to count red neighbors:     0 0 0 1 0 1 0 0 0 (2 red tiles)
                                         AND
 get red channel                  1 1 1 0 0 1 0 0 0 (only 1 collides)
                                         =
 result                           0 0 0 0 0 1 0 0 0
                                   int.bit_count()   
 bit count =  1 ==> bad

start (check collisions for 2 blues vs 4 reds at once)

 blue channel               :        0 0 0 0 1 0 0 1 0 (2 blues)
                                            
 mask to count red neighbors:        0 0 0 1 0 1 1 0 1 (4 reds)
                                           AND
 get red channel                     1 1 1 1 0 1 1 0 1 (4 collides)
                                            =
 result                              0 0 0 1 0 1 1 0 1
                                      int.bit_count()   
 bit count =  4 ==> acceptable                    
  

So, if 000010010 means 17, then mask[17] from array of masks can be applied on red channel to test the condition.

Something like this:

mask   = masks[blue_channel]
result = (mask and red_channel)
count  = result.bit_count()
count2 = count_masks[blue_channel]
if count == count2:
    print("ok")

Since there are 18 tiles, each constant mask array can have at most 2^18 elements (less than a million). When permutation generator is generating the tile patterns in sorted order, the mask-array access pattern can be cache-friendly.

If this causes too high cache-miss, then each mask can be computed at the cost of few CPU cycles:

blue:                0 0 0 0 1 0 0 0
compute left shift:  0 0 0 1 0 0 0 0
compute right shift: 0 0 0 0 0 1 0 0 
OR the two shifts:   0 0 0 1 0 1 0 0 --> collision mask

blue:                0 0 1 0 1 0 0 0
compute left shift:  0 1 0 1 0 0 0 0
compute right shift: 0 0 0 1 0 1 0 0 
OR the two shifts:   0 1 0 1 0 1 0 0 --> collision mask

but this would work only for the surface tiles. The interior tiles can have 3 neighbors like this:

blue:                0 0 0 0 1 0 0 0 
hash1                1 0 0 0 0 0 0 0
hash2                0 0 1 0 0 0 0 0 
hash3                0 0 0 0 0 1 0 0 
OR                   1 0 1 0 0 1 0 0 --> collision mask

Finding a hash function can be tricky. Perhaps only interior tiles could access a mask array while surface tiles are only computed using bitwise operations. Perhaps, if you reorder tiles in bit positions, you can come up with a pattern that can work with shifting for interior tiles too.

For example, according to your mapping,

  • tile 12 checks: 11, 13, 17.
    • 11: left shift by 1 bit
    • 13: right shift by 1 bit
    • 17: right shift by 5 bits
  • tile 14 checks: 13,15, 3.
    • 13: left shift 1
    • 15: right shift 1
    • 3: left shift by 11 --> not similar to tile 12's operation

it doesn't work immediately. It requires some extra mapping. You can just access a bit-pattern array of 2^6 number of elements for interior vs surface and 2^9 elements for surface vs interior. This fits inside L1 cache for faster testing of third bit pattern.


If problem always asks 6 blue tiles, then this masked-collision method computes 6 of them at once using 4-5 bit-wise operations and 6 array accesses. String version would require 12 array accesses. So it should give performance by accessing memory less.


Simpler approach:

  • tile map looks like 3 hexagons collided
  • 1 byte per hexagon
  • simple AND between center and surface x3 (and opposite)
  • simple AND with self x 3