grouper recipe but return last "group" even with uneven length


I am getting the hang of the grouper() recipe from itertools:

from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

This version pads the last group with a given fill value. If I drop the fill value by switching to plain zip(), the last group is not returned at all when it has fewer than n elements. I have run into several situations where I want the last group whether or not it is the same size as the others, and without any padding. How do I go about doing this?
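
To make the two behaviours concrete, here is a minimal sketch. It assumes that "removing fillvalue" means replacing zip_longest with plain zip; the names grouper_padded and grouper_truncating are made up for this illustration and are not part of itertools.

```python
from itertools import zip_longest

def grouper_padded(iterable, n, fillvalue=None):
    # The recipe as given: n references to the same iterator, padded at the end.
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

def grouper_truncating(iterable, n):
    # Hypothetical variant using zip(): stops as soon as a group cannot be filled.
    args = [iter(iterable)] * n
    return zip(*args)

padded = list(grouper_padded('ABCDEFG', 3, 'x'))
truncated = list(grouper_truncating('ABCDEFG', 3))

print(padded)     # [('A', 'B', 'C'), ('D', 'E', 'F'), ('G', 'x', 'x')]
print(truncated)  # [('A', 'B', 'C'), ('D', 'E', 'F')]
```

Neither result is what I want, which would be a final group of ('G',).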

There are 3 best solutions below

I usually use islice:

from itertools import islice

def grouper(iterable, n):
    iterator = iter(iterable)
    group = tuple(islice(iterator, n))
    while group:
        yield group
        group = tuple(islice(iterator, n))

If you prefer, the following is just a slight change in the logic, but it works the same:

def grouper(iterable, n):
    iterator = iter(iterable)
    while True:
        group = tuple(islice(iterator, n))
        if not group:
            return
        yield group

There are other variants here too. For example, if you want a completely lazy version:

from itertools import groupby

def grouper(iterable, n):
    iterator = enumerate(iterable)
    for unused_group_number, idx_items in groupby(iterator, lambda t: t[0] // n):
        yield (item for unused_idx, item in idx_items)

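This is likely slower than the islice version because it makes many more calls back into Python code, but it is completely lazy: it yields an iterator for each group rather than a tuple, and it still works when the caller doesn't consume every element of a group, because groupby skips the leftovers when it advances. Note that each group must be consumed before moving on to the next one, since advancing invalidates the previous group's iterator.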
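
As a rough illustration of that behaviour, the sketch below repeats the groupby-based grouper and uses it in two ways: consuming each group fully, and taking only the first element of each group. The variable names full and firsts are just for this example.

```python
from itertools import groupby

def grouper(iterable, n):
    # Pair each item with its index, then group by index // n.
    iterator = enumerate(iterable)
    for _, idx_items in groupby(iterator, lambda t: t[0] // n):
        yield (item for _, item in idx_items)

# Each group is consumed before the next one is requested.
full = [''.join(group) for group in grouper('ABCDEFG', 3)]
print(full)    # ['ABC', 'DEF', 'G']

# Only the first item of each group is read; groupby skips the rest.
firsts = [next(group) for group in grouper('ABCDEFG', 3)]
print(firsts)  # ['A', 'D', 'G']
```

Collecting all the groups first (for example with list(grouper(...))) and reading them afterwards does not work with this version, so use the islice variant if you need that.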

You can use islice() in a loop to read the elements in chunks:

from itertools import islice

def grouper(iterable, n):
    it = iter(iterable)
    group = tuple(islice(it, n))
    while group:
        yield group
        group = tuple(islice(it, n))

print(list(grouper('ABCDEFG', 3)))  # [('A', 'B', 'C'), ('D', 'E', 'F'), ('G',)]

Since Python 3.12, the functionality you are looking for is available in the standard library as itertools.batched:

https://docs.python.org/3/library/itertools.html#itertools.batched

import itertools

print(list(itertools.batched('ABCDEFG', 3)))  # [('A', 'B', 'C'), ('D', 'E', 'F'), ('G',)]
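
If the code also has to run on interpreters older than 3.12, one possible approach is to fall back to a small islice-based stand-in when the import fails. This is only a sketch under that assumption; the fallback below is a hand-written substitute, not the standard library's implementation.

```python
from itertools import islice

try:
    from itertools import batched  # available from Python 3.12
except ImportError:
    def batched(iterable, n):
        # Fallback: yield tuples of up to n items; the last one may be shorter.
        if n < 1:
            raise ValueError("n must be at least one")
        iterator = iter(iterable)
        while batch := tuple(islice(iterator, n)):
            yield batch

result = list(batched('ABCDEFG', 3))
print(result)  # [('A', 'B', 'C'), ('D', 'E', 'F'), ('G',)]
```

Either way, callers use the same name, batched, and get the short final group without padding.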