I am trying to use NumPy for fast text analysis, specifically collocation analysis. Suppose I have the following sequence of tokens, which I converted into a NumPy array:
text = np.array(['a', 'b', 'c', 'd', 'e', 'b', 'f', 'g'])
Suppose I want to extract from that array the left and right contexts of the letter 'b': say, 1 element to the left and 2 elements to the right of each occurrence. So I want to get something like this:
['a', 'c', 'd'] + ['e', 'f', 'g']
Is it possible to do this with NumPy, broadcasting all the operations? I currently just loop over the text, but that is very time-consuming.
I tried np.select, np.where and np.mask.
Thanks for your help :)
One possible way is to find the indices of the 'b' values (with np.where(arr == 'b')) and then use them to index the adjacent values:
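A minimal sketch of that idea, assuming the window is 1 element to the left and 2 to the right as in the question. The function name `contexts` and its `left`/`right` parameters are made up for illustration; only `np.where`, `np.arange`, `np.concatenate` and broadcasting are actual NumPy features. Positions that would fall outside the array are dropped rather than wrapped around.

```python
import numpy as np

text = np.array(['a', 'b', 'c', 'd', 'e', 'b', 'f', 'g'])

def contexts(arr, target, left=1, right=2):
    """Return, for each occurrence of `target`, its neighbours as an array.

    `contexts`, `left` and `right` are hypothetical names for this sketch.
    """
    # Positions of every occurrence of the target value.
    idx = np.where(arr == target)[0]
    # Relative offsets of the context window, skipping 0 (the target itself).
    offsets = np.concatenate([np.arange(-left, 0), np.arange(1, right + 1)])
    # Broadcasting: (n_hits, 1) + (window,) -> (n_hits, window) absolute positions.
    pos = idx[:, None] + offsets
    # Mask out positions that fall before the start or past the end of the array.
    valid = (pos >= 0) & (pos < arr.size)
    # Rows can have different lengths near the edges, so return a list of arrays.
    return [arr[row[ok]] for row, ok in zip(pos, valid)]

result = contexts(text, 'b')
print([r.tolist() for r in result])  # [['a', 'c', 'd'], ['e', 'f', 'g']]
```

The only Python-level loop left is over the occurrences, and only to handle windows cut off at the edges. If every occurrence is known to be far enough from both ends, `arr[pos]` alone returns the whole result as a single 2-D array.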