Depending on your view of my approach, this is either a question about using np.unique() on awkward1 arrays or a call for a better approach:
Let a and b be two awkward1 arrays of the same outer length (number of events) but different inner lengths. For example:
a = [[1, 2], [3] , [] , [4, 5, 6]]
b = [[7] , [3, 5], [6], [8, 9]]
Let f: (x, y) -> z be a function that acts on two numbers x and y and results in the number z. For example:
f(x, y):= y - x
The idea is to compare every element in a with every element in b via f for each event, and to keep only the pairs of elements from a and b that survive some cut applied to f. For example:
f(x, y) < 4
My approach for this is:
import awkward1 as ak
import numpy as np
a = ak.from_iter(a)
b = ak.from_iter(b)
c = ak.cartesian({'x':a, 'y':b})
#c= [[{'x': 1, 'y': 7}, {'x': 2, 'y': 7}], [{'x': 3, 'y': 3}, {'x': 3, 'y': 5}], [], [{'x': 4, 'y': 8}, {'x': 4, 'y': 9}, {'x': 5, 'y': 8}, {'x': 5, 'y': 9}, {'x': 6, 'y': 8}, {'x': 6, 'y': 9}]]
i = ak.argcartesian({'x':a, 'y':b})
#i= [[{'x': 0, 'y': 0}, {'x': 1, 'y': 0}], [{'x': 0, 'y': 0}, {'x': 0, 'y': 1}], [], [{'x': 0, 'y': 0}, {'x': 0, 'y': 1}, {'x': 1, 'y': 0}, {'x': 1, 'y': 1}, {'x': 2, 'y': 0}, {'x': 2, 'y': 1}]]
diff = c['y'] - c['x']
#diff= [[6, 5], [0, 2], [], [4, 5, 3, 4, 2, 3]]
cut = diff < 4
#cut= [[False, False], [True, True], [], [False, False, True, False, True, True]]
new = c[cut]
#new= [[], [{'x': 3, 'y': 3}, {'x': 3, 'y': 5}], [], [{'x': 5, 'y': 8}, {'x': 6, 'y': 8}, {'x': 6, 'y': 9}]]
new_i = i[cut]
#new_i= [[], [{'x': 0, 'y': 0}, {'x': 0, 'y': 1}], [], [{'x': 1, 'y': 0}, {'x': 2, 'y': 0}, {'x': 2, 'y': 1}]]
It is possible that pairs with the same element from a but different elements from b survive the cut (e.g. {'x': 3, 'y': 3} and {'x': 3, 'y': 5}).
My goal is to group the pairs that share the same element from a and therefore reshape the new array into:
new = [[], [{'x': 3, 'y': [3, 5]}], [], [{'x': 5, 'y': 8}, {'x': 6, 'y': [8, 9]}]]
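For reference, the target structure can be pinned down with plain Python loops. This is only a minimal sketch of the expected result, not a vectorized awkward1 solution. The helper name group_matches and its max_diff parameter are made up for illustration, and y is always kept as a list (even for a single match), which differs slightly from the mixed scalar/list form shown above.

```python
def group_matches(a, b, max_diff=4):
    """Per event, collect for each x in a the y values in b with y - x < max_diff."""
    result = []
    for xs, ys in zip(a, b):  # one iteration per event
        event = []
        for x in xs:
            # Cut applied to f(x, y) = y - x
            matched = [y for y in ys if y - x < max_diff]
            if matched:  # drop elements of a without any surviving partner
                event.append({"x": x, "y": matched})
        result.append(event)
    return result


a = [[1, 2], [3], [], [4, 5, 6]]
b = [[7], [3, 5], [6], [8, 9]]
grouped = group_matches(a, b)
```

A loop like this is slow on large samples, but it is handy as a cross-check for any array-based version.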
My only idea for how to achieve this is to create a list of the indexes from a that are still present after the cut by using new_i:
i = new_i['x']
#i= [[], [0, 0], [], [1, 2, 2]]
However, I need a unique version of this list so that every index appears only once. In NumPy this could be achieved with np.unique(), but that does not work on awkward1 arrays:
np.unique(i)
<__array_function__ internals> in unique(*args, **kwargs)
TypeError: no implementation found for 'numpy.unique' on types that implement __array_function__: [<class 'awkward1.highlevel.Array'>]
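One possible workaround, sketched here under the assumption that a Python-level loop over events is fast enough, is to call np.unique() on each event separately. The nested list i_list below is a stand-in for what ak.to_list(i) would return, and the names i_list and unique_i are chosen for illustration only.

```python
import numpy as np

# Stand-in for ak.to_list(i): indexes into a that survive the cut, per event.
i_list = [[], [0, 0], [], [1, 2, 2]]

# np.unique() handles one flat event at a time. The explicit dtype keeps
# empty events as integer arrays instead of NumPy's float64 default.
unique_i = [
    np.unique(np.asarray(event, dtype=np.int64)).tolist()
    for event in i_list
]
```

This sidesteps the TypeError because np.unique() only ever sees regular one-dimensional NumPy arrays.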
My question:
Is there an np.unique() equivalent in awkward1, and/or would you recommend a different approach to my problem?
Okay, I still don't know how to use np.unique() on my arrays, but I found a solution to my own problem.
In my previous approach I paired up both arrays with a plain ak.cartesian() call. However, with the nested = True parameter of ak.cartesian() I get a list grouped by the elements of a.
After the cut, I extract the y values and reduce the innermost layer of the nested lists of new to only one element. (I tried to use ak.firsts() with axis = -1, but it does not seem to be implemented yet.)
Now every innermost entry in new belongs to exactly one element from a. By replacing the current y of new with the previously extracted y, I end up with my desired result.
Anyway, should you know a better solution, I'd be pleased to hear it.