Why is Python's zip function a builtin?


I'm aware Python's zip() function is a builtin function written in C but I'm confused as to why they have bothered.

If I'm not mistaken, in Python, zip() can be rewritten as:

def zip(seq1, seq2):
    out = []
    for i in range(len(seq1)):
        out.append((seq1[i], seq2[i]))

    return out

(assuming seq1 and seq2 are the same length; the actual zip() stops at the shortest sequence)
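
For reference, a version that stops at the shortest input and accepts any iterable, not only indexable sequences, can be sketched with a generator. This is a minimal sketch rather than CPython's C implementation: the name zip_shortest is made up for illustration, and it is lazy like Python 3's zip instead of returning a list the way Python 2's zip does.

```python
def zip_shortest(*iterables):
    """Yield tuples until the shortest input runs out (hypothetical helper)."""
    iterators = [iter(it) for it in iterables]
    if not iterators:
        return  # no inputs: yield nothing, as zip() does
    while True:
        row = []
        for it in iterators:
            try:
                row.append(next(it))
            except StopIteration:
                return  # the first exhausted input ends the whole zip
        yield tuple(row)


# The second input is shorter, so only two pairs are produced.
pairs = list(zip_shortest([1, 2, 3], "ab"))
```

Working through iterators instead of seq[i] is what lets this handle generators and files, which the index-based version in the question cannot.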

This is a very simple piece of code, so I'm wondering: why have they made zip() a builtin function? I can only assume it is written in C to make it faster. If that is the case, does anybody know how much faster? (I'm aware this will depend on the size of seq1 and seq2.)

Best answer:
In [148]: def zipped(seq1, seq2):
   .....:     out = []
   .....:     for i in range(len(seq1)):
   .....:         out.append((seq1[i], seq2[i]))
   .....:     return out
   .....:

In [149]: %timeit zipped(range(10), range(11, 21))
100000 loops, best of 3: 3.06 µs per loop


In [152]: %timeit zip(range(10), range(11, 21))
1000000 loops, best of 3: 1.23 µs per loop

As you can see, the builtin zip is more than twice as fast as the pure-Python zipped function.

from itertools import izip
In [156]: %timeit list(izip(range(10), range(11, 21)))
100000 loops, best of 3: 1.98 µs per loop

itertools.izip, once wrapped in list(), takes roughly as long as zip, and the same pattern holds for large inputs:

In [157]: %timeit zipped(range(10**5), range(10**5))
10 loops, best of 3: 77.3 ms per loop

In [158]: %timeit zip(range(10**5), range(10**5))
10 loops, best of 3: 31.5 ms per loop

In [159]: %timeit list(izip(range(10**5), range(10**5)))
10 loops, best of 3: 37.4 ms per loop
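
The timings above come from a Python 2 IPython session, where zip returns a list and izip is its lazy counterpart. A rough way to repeat the comparison with the standard timeit module is sketched below, assuming Python 3, where zip is itself lazy and is therefore wrapped in list() so that it does the same work as zipped. The helper name best_time and the repeat counts are arbitrary choices, and the absolute numbers will vary with the machine and interpreter version.

```python
import timeit


def zipped(seq1, seq2):
    # Pure-Python version from the question; assumes equal-length sequences.
    out = []
    for i in range(len(seq1)):
        out.append((seq1[i], seq2[i]))
    return out


def best_time(func, number=5, repeat=3):
    # Best per-call time in seconds over several repeats (hypothetical helper).
    return min(timeit.repeat(func, number=number, repeat=repeat)) / number


a = range(10**5)
b = range(10**5)

t_python = best_time(lambda: zipped(a, b))
t_builtin = best_time(lambda: list(zip(a, b)))

print("zipped:    %.2f ms per call" % (t_python * 1e3))
print("list(zip): %.2f ms per call" % (t_builtin * 1e3))
```

Taking the minimum over repeats mirrors the "best of 3" figure that %timeit reports in the session above.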