I am trying to vectorize a partial function which takes two arguments, both of them lists, then does something to the pairwise elements from the lists (using zip). However, I am finding some unexpected behaviour.
Consider the following code:
import functools
import numpy as np

def f(l1,l2):
    l1 = l1 if isinstance(l1,list) or isinstance(l1,np.ndarray) else [l1]
    l2 = l2 if isinstance(l2,list) or isinstance(l2,np.ndarray) else [l2]
    for e1,e2 in zip(l1,l2):
        print(e1,e2)

f(['a','b'],[1,2])
fp = functools.partial(f,l1=['a','b'])
fp(l2=[1,2])
fv = np.vectorize(fp)
fv(l2=np.array([1,2]))
The output from the Jupyter notebook is as follows:
a 1
b 2
a 1
b 2
a 1
a 1
a 2
array([None, None], dtype=object)
I have two questions:
- First, the type check at the beginning of f is necessary because np.vectorize seems to automatically fully flatten any input (I get an "int32 not iterable" exception otherwise). Is there a way to avoid this?
- Secondly, when the partial function fp is vectorized, the output is clearly not the expected one. I am not sure I understand what NumPy is doing here, including the final empty array output. No matter how much I nest [1,2] within a list, tuple or array, the output always seems to be the same. How can I fix my code so that the vectorized function fv behaves as expected, that is, the same as fp?
Edit
Another try I have done is:
fpv(l2=[np.array([1,2]), np.array([3,4])])
whose output is:
a 1
a 1
a 2
a 3
a 4
After the changes to isinstance, I analyzed further:
Output:
As expected. Two lists [a, b] and [1, 2] zipped together.
Same as above, functools.partial just wraps the function with two args into a function with one arg injected by functools and one exposed. Same input, same output.
This is what I would expect vectorize to do: map the function fp over the members of the input l2. So I would expect the underlying function calls to f():
This is almost what we see happen, except that the first call is made twice.
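To make the expected per-element mapping concrete, here is a small sketch written as a plain loop. It assumes a hypothetical variant of f, called pairs here, that returns the zipped pairs instead of printing them; everything else follows the question. Because each call receives a single scalar for l2, zip stops after the first element of l1, which is why only 'a' ever appears.

```python
import functools

def pairs(l1, l2):
    # Hypothetical variant of f from the question: returns the pairs
    # instead of printing them, so the result can be inspected.
    l1 = l1 if isinstance(l1, (list, tuple)) else [l1]
    l2 = l2 if isinstance(l2, (list, tuple)) else [l2]
    return list(zip(l1, l2))

fp = functools.partial(pairs, l1=['a', 'b'])

# Element-wise mapping: one call per scalar in l2, as vectorize would do.
expected_calls = [fp(l2=x) for x in [1, 2]]
print(expected_calls)  # [[('a', 1)], [('a', 2)]]
```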
Minimal reproduction scenario:
Prints 4 lines: 1, 1, 2, 3.
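A reproduction along these lines might look like the following sketch. The helper record is a hypothetical name introduced only for illustration; it stores each argument it receives so the extra call on the first element becomes visible without relying on printed output.

```python
import numpy as np

calls = []

def record(x):
    # Hypothetical helper: remembers every argument it is called with.
    calls.append(int(x))
    return x

# No otypes given, so vectorize first probes the function with the first element.
np.vectorize(record)(np.array([1, 2, 3]))
print(calls)  # [1, 1, 2, 3]
```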
So the unexpected behaviour comes from np.vectorize rather than from the ndarray class: when otypes is not given, it calls the function once on the first element to determine the output type, and then calls it again for that element during the actual mapping.
This problem was also addressed in Why does numpy's vectorize function perform twice on the first element.
Here's the fix:
np.vectorize(fp, otypes=['str'])(l2=np.array([1, 2, 3]))
Specifying otypes eliminates the extra call on the first element of the vector.
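The effect can be checked by counting calls. The sketch below again uses a hypothetical helper named record, and it also tries cache=True, another np.vectorize option that reuses the result of the probing call instead of repeating it. Which of the two fits better probably depends on whether the output type is known in advance.

```python
import numpy as np

calls = []

def record(x):
    # Hypothetical helper: remembers every argument it is called with.
    calls.append(int(x))
    return x

# With otypes given, no probing call is needed.
np.vectorize(record, otypes=[int])(np.array([1, 2, 3]))
with_otypes = list(calls)

# With cache=True, the probing call still happens but its result is reused.
calls.clear()
np.vectorize(record, cache=True)(np.array([1, 2, 3]))
with_cache = list(calls)

print(with_otypes, with_cache)  # [1, 2, 3] [1, 2, 3]
```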