Place different items evenly in array

163 Views Asked by At

I'm wondering if it is possible to somehow "sort" items in an array to place them in "equal" spacings.

An example is more than hundreds of words so:

Apple - 1
Banana - 2
Pineapple - 3
Orange - 4

And this is an array:

[ 'Apple', 'Apple', 'Banana', 'Pineapple', 'Pineapple', 'Pineapple', 'Orange' ]
[ 1, 1, 2, 3, 3, 3, 4 ]

What I want to achieve is something similar to this:

[ 'Apple', 'Pineapple', 'Banana', 'Apple', 'Pineapple', 'Orange', 'Pineapple' ]
[ 1, 3, 2, 1, 3, 4, 3 ] 

With this transformation Pineapple has 1 item offset between other 'Pineapple' and Apple is placed in [0] and [3] placement.

Before I start an implementation I'm looking for an already invented solution - it should be something related with standard deviation?

2

There are 2 best solutions below

2
On BEST ANSWER

Start off by ordering your words by number of occurences. Then iterate over them, first filling up all even indices, then all odd indices.
The first word can at most fill up all even indices. In most modern arrays there should always be at least as many slots with an even index as there are with an odd one. If your language doesn't qualify for that (i.e. one-based arrays), pick even or odd based on the number of available slots.
The second most common word can only occur at most as many times as the most common word, so there's no possibility this way that the same word winds up in two adjacent slots that way.
A simple python-implementation would look like this:

import math

def spaced_ordering(words):
    words = sorted(words, key=words.count, reverse=True)
    output = [None] * len(words)

    for i in range(0, math.ceil(len(words) / 2)):
        output[i * 2] = words[i]

    for i in range(0, math.floor(len(words) / 2)):
        output[i * 2 + 1] = words[math.ceil(len(words) / 2) + i]

    return output

Note: The above implementation is neither exactly performant, nor exactly fancy, nor does it include checking for valid inputs (e.g. what happens if a word occurs more than math.ceil(len(words) / 2) times). It only serves to demonstrate the basic principle.

4
On

The class of algorithm you're looking for is called multiplexing. A multiplexer takes several input streams, and creates a single output stream, selecting one item at a time from the input. There are many different multiplexing strategies. I'll describe one that's easy to implement, and performs well.

The general idea is that each item has a name, rate, and accumulator, and the item with the largest value in its accumulator is chosen next. In the example given in the question, the rates are 2 for Apple, 1 for Banana, 3 for Pineapple, and 1 for Orange. The sum of the rates is the period, which is 7.

The algorithm operates as follows:

initialize all accumulators to 0
for each slot in one period:
    choose the item with the largest accumulator, and add it to the output
    update each accumulator by adding the rate to the accumulator
    subtract the period from the accumulator of the chosen item

The table below shows how the algorithm progresses. The slots are labelled S1 thru S7. For each slot there are two columns of numbers, the accumulator value for each item, and the adjustment to the accumulator.

In slot 1, the Orange is chosen, so the adjustment to the accumulator is +1 -7 = -6 (add the rate, and subtract the period). For every other item the adjustment is equal to the rate. Notice that all the accumulators start at 0, and return to 0 after the seventh slot. Hence, the algorithm could be run for any number of slots, and it would simply repeat the same pattern.

 Name    Rate  __S1__   __S2__   __S3__   __S4__   __S5__   __S6__   __S7__
Orange    1/7   0  -6   -6  +1   -5  +1   -4  +1   -3  +1   -2  +1   -1  +1   0
Banana    1/7   0  +1    1  +1    2  +1    3  -6   -3  +1   -2  +1   -1  +1   0
Apple     2/7   0  +2    2  +2    4  -5   -1  +2    1  +2    3  -5   -2  +2   0
Pineapple 3/7   0  +3    3  -4   -1  +3    2  +3    5  -4    1  +3    4  -4   0

Selected item: Orange    Pine     Apple   Banana    Pine     Apple    Pine

Here's an implementation in Python:

items = ['Apple', 'Apple', 'Banana', 'Pineapple', 'Pineapple', 'Pineapple', 'Orange']

# Convert the list of items into a list that contains the [name, rate, accumulator]
# for each item. The initial value for the accumulator is 0
counts = {}
for item in items:
    counts[item] = counts.get(item, 0) + 1
rates = counts.items()
rates = [[name, rate, 0] for (name, rate) in rates]
rates.sort(key=lambda x:x[1])

# Run the multiplexer, which
#    adds the item with the largest accumulator to the output
#    updates all the accumulators by adding the rate to the accumulator
#    subtracts the period from the chosen accumlator
output = []
period = len(items)
for i in range(period):
    best = 0
    for j in range(len(rates)):
        if rates[j][2] > rates[best][2]:  # compare accumulators
            best = j
        rates[j][2] += rates[j][1]        # update accumulator
    rates[best][2] -= period
    output.append(rates[best][0])         # add an item to the output

print output  # ['Orange', 'Pineapple', 'Apple', 'Banana', 'Pineapple', 'Apple', 'Pineapple']