Having recently succeeded in writing a minimal Python 3.6 extension module in C++ (see here), I plan to provide a Python module that does the same as the following Python function iterUniqueCombos():
def iterUniqueCombos(lstOfSortableItems, sizeOfCombo):
    lstOfSortedItems = sorted(lstOfSortableItems)
    sizeOfList = len(lstOfSortedItems)
    lstComboCandidate = []

    def idxNextUnique(idxItemOfList):
        idxNextUniqueCandidate = idxItemOfList + 1
        while (
            idxNextUniqueCandidate < sizeOfList
            and
            lstOfSortedItems[idxNextUniqueCandidate] == lstOfSortedItems[idxItemOfList]
        ):  # while
            idxNextUniqueCandidate += 1
        idxNextUnique = idxNextUniqueCandidate
        return idxNextUnique

    def combinate(idxItemOfList):
        if len(lstComboCandidate) == sizeOfCombo:
            yield tuple(lstComboCandidate)
        elif sizeOfList - idxItemOfList >= sizeOfCombo - len(lstComboCandidate):
            lstComboCandidate.append(lstOfSortedItems[idxItemOfList])
            yield from combinate(idxItemOfList + 1)
            lstComboCandidate.pop()
            yield from combinate(idxNextUnique(idxItemOfList))

    yield from combinate(0)
I have some basic understanding of Python and C++ programming, but absolutely no idea how to "translate" Python's yield into the C++ code of a Python extension module. So my question is:
How to write C++ code (of a Python module) able to return a Python iterator object?
Any hints to get me started are welcome.
UPDATE (status 2017-05-07):
Both the comment
"yield has no C++ equivalent. I'd start by implementing the iterator protocol manually in Python, to get out of the yield and yield from mindset." – user2357112 Apr 26 at 1:16
and the hint in the answer by danny
"the answer to this question is the same as asking 'How do I implement an iterator without using yield' but in a C++ extension instead of pure Python."
sent my programming efforts in the wrong direction: I reinvented the wheel by rewriting the algorithm's code to eliminate yield and by writing the C code of a Python extension module from scratch (which resulted in a rain of Segmentation Fault errors).
My current understanding of the subject is that Cython can translate the above Python code (which uses yield) directly into the C code of a Python extension module.
This works with the Python code just as it is, without rewriting anything. In addition, the extension module that Cython creates from the algorithm using yield runs at least twice as fast as the extension module created from the same algorithm rewritten as an iterator class using __iter__ and __next__ (the latter holds if no Cython-specific speed optimization code is added to the Python script).
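For reference, the kind of iterator-class rewrite mentioned above could look roughly like the following pure-Python sketch. It is only an illustration of the idea, not the code that was used for the speed comparison: the class name UniqueCombos, its attributes, and the helper methods are made up here. The recursion of combinate() is replaced by an explicit stack of chosen indices, so that __next__ can stop and resume without yield.

```python
class UniqueCombos:
    """Hypothetical iterator-protocol version of iterUniqueCombos (no yield)."""

    def __init__(self, items, size_of_combo):
        self._items = sorted(items)
        self._k = size_of_combo
        self._stack = []     # indices of the items chosen so far
        self._idx = 0        # next index to consider
        self._done = False

    def __iter__(self):
        return self

    def _next_unique(self, idx):
        # Skip over items equal to the one at idx.
        nxt = idx + 1
        while nxt < len(self._items) and self._items[nxt] == self._items[idx]:
            nxt += 1
        return nxt

    def _backtrack(self):
        # Corresponds to returning from combinate(): drop the last chosen
        # item and continue with the next different one.
        if not self._stack:
            self._done = True
        else:
            self._idx = self._next_unique(self._stack.pop())

    def __next__(self):
        n = len(self._items)
        while not self._done:
            if len(self._stack) == self._k:
                combo = tuple(self._items[i] for i in self._stack)
                self._backtrack()
                return combo
            if n - self._idx >= self._k - len(self._stack):
                # Enough items left: choose the current one and move on.
                self._stack.append(self._idx)
                self._idx += 1
            else:
                self._backtrack()
        raise StopIteration


print(list(UniqueCombos([1, 1, 2], 2)))
```

All state that the generator kept implicitly in its suspended frames (the candidate list and the current index) now lives in instance attributes, which is the same bookkeeping a hand-written C or C++ extension type would have to do in its object struct.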
This is more a response to your question edit than a complete answer - I agree with the gist of Danny's answer, that you need to implement this in a class with a __next__/next method (depending on the version of Python). In your edit you assert that it must be possible because Cython can do it. I thought it was worth having a look at exactly how Cython does it.

Start with a basic example (picked because it has a few different yield statements and a loop):

The first thing Cython does is define a __pyx_CoroutineObject C class with a __Pyx_Generator_Next method that implements __next__/next. A few relevant attributes of the __pyx_CoroutineObject:

body - a C function pointer that implements the logic you've defined.
resume_label - an integer used to remember how far you've got in the function defined by body.
closure - a custom created C class that stores all the variables used within body.

In a slightly roundabout way, __Pyx_Generator_Next calls the body attribute, which is the translation of the Python code you've defined.

Let's then look at how the function assigned to body works - in the case of my example called __pyx_gb_5iters_2generator. The first thing it does is use resume_label to jump to the right yield statement:

Any variable assignments are done through the closure structure (locally named __pyx_cur_scope):

yield sets the resume_label and returns (with resume_label allowing you to jump straight back next time):

The loop is slightly more complicated but basically the same thing - it uses goto to jump into the C loop (which is legal).

Finally, once it's reached the end it raises a StopIteration error:

In summary, Cython does exactly what you've been advised to do: it defines a class with a __next__ or next method and uses that class to keep track of the state. Because it's automated, it's quite good at keeping track of reference counting and thus avoids the Segmentation Fault errors you experienced. The use of goto to return to the previous point of execution is efficient, but requires care.

I can see why rewriting your generator function in C in terms of a single __next__/next function is unappealing, and Cython clearly offers a straightforward way of doing it without writing C yourself, but it doesn't use any special techniques to do the translation on top of what you've already been told.
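The mechanism described above can be imitated in plain Python to make it easier to follow. This is a hand-written sketch, not Cython's generated code: the generator below is a made-up example, the names CoroutineObject, Closure and body are simplified stand-ins for the Cython names, and since Python has no goto, a chain of if tests on resume_label plays the role of the jumps.

```python
def generator():
    # Hypothetical reference generator: a few yields and a loop.
    yield 1
    for i in range(3):
        yield i * 10
    yield "done"


class Closure:
    """Holds the generator's local variables between calls (here only i)."""

    def __init__(self):
        self.i = 0


def body(gen):
    scope = gen.closure
    if gen.resume_label == 0:          # fresh start: first yield
        gen.resume_label = 1
        return 1
    if gen.resume_label == 1:          # resumed after first yield: enter loop
        scope.i = 0
    elif gen.resume_label == 2:        # resumed inside the loop: next round
        scope.i += 1
    if gen.resume_label in (1, 2):
        if scope.i < 3:
            gen.resume_label = 2       # remember we are inside the loop
            return scope.i * 10
        gen.resume_label = 3           # loop finished: last yield
        return "done"
    gen.resume_label = -1              # past the last yield: exhausted
    raise StopIteration


class CoroutineObject:
    """Simplified analogue of the class with body/resume_label/closure."""

    def __init__(self, body_func, closure):
        self.body = body_func
        self.resume_label = 0
        self.closure = closure

    def __iter__(self):
        return self

    def __next__(self):
        # Every call re-enters body, which jumps to where it left off.
        return self.body(self)


print(list(CoroutineObject(body, Closure())))
print(list(generator()))
```

Both print statements should show the same sequence, which is the point: the state that a generator keeps implicitly is held explicitly in resume_label and the closure object.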