I have an m x 3 matrix A and its row subset B (n x 3). Both are sets of indices into another, large 4D matrix; their data type is dtype('int64'). I would like to generate a boolean vector x, where x[i] = True if B does not contain row A[i,:].
There are no duplicate rows in either A or B.
I was wondering if there's an efficient way to do this in NumPy. I found an answer that's somewhat related: https://stackoverflow.com/a/11903368/265289; however, it returns the actual rows (not a boolean vector).
You could follow the same pattern as shown in jterrace's answer, except use `np.in1d` instead of `np.setdiff1d`.
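A minimal sketch of that pattern, assuming `A` and `B` share an integer dtype and column count. The function name `rows_not_in` and the field names `f0`, `f1`, ... are illustrative, not taken from the original answer. It calls `np.isin` in place of `np.in1d`, since recent NumPy releases deprecate `np.in1d`; the view is flattened explicitly because `np.isin` preserves the shape of its first argument.

```python
import numpy as np

def rows_not_in(A, B, assume_unique=False):
    """Return a boolean vector x with x[i] True when row A[i] is absent from B."""
    # ascontiguousarray guards against non-contiguous inputs, where view() would fail.
    A = np.ascontiguousarray(A)
    B = np.ascontiguousarray(B, dtype=A.dtype)
    # One structured element per row; the field names are arbitrary.
    row_dtype = [(f"f{i}", A.dtype) for i in range(A.shape[1])]
    Ad = A.view(row_dtype).ravel()
    Bd = B.view(row_dtype).ravel()
    return ~np.isin(Ad, Bd, assume_unique=assume_unique)

A = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]], dtype=np.int64)
B = A[[1, 3]]  # rows 1 and 3 of A
x = rows_not_in(A, B, assume_unique=True)
print(x.tolist())  # [True, False, True, False]
```

Each row is reinterpreted as a single structured element, so the row-wise membership test reduces to a 1D one.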
You can use `assume_unique=True` (which can speed up the calculation) since there are no duplicate rows in `A` or `B`.

Beware that `A.view(...)` will raise a `ValueError` if `A.flags['C_CONTIGUOUS']` is `False` (i.e. if `A` is not a C-contiguous array). Therefore, in general we need to use `np.ascontiguousarray(A)` before calling `view`.

As B.M. suggests, you could instead view each row using the "void" dtype, so that each row becomes a single element of raw bytes.
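The void-dtype variant might look like the sketch below. The name `rows_not_in_void` is hypothetical, and `np.isin` again stands in for `np.in1d`. The example deliberately passes a non-contiguous `A` (a column slice of a wider array) to show that `np.ascontiguousarray` makes the view safe.

```python
import numpy as np

def rows_not_in_void(A, B):
    """Membership test that compares each row as one opaque block of bytes."""
    A = np.ascontiguousarray(A)
    B = np.ascontiguousarray(B, dtype=A.dtype)
    # One void element spans a full row: itemsize * number of columns bytes.
    row_dtype = np.dtype((np.void, A.dtype.itemsize * A.shape[1]))
    return ~np.isin(A.view(row_dtype).ravel(), B.view(row_dtype).ravel())

wide = np.arange(20, dtype=np.int64).reshape(4, 5)
A = wide[:, :3]   # column slice of a wider array, so not C-contiguous
B = A[[0, 2]]     # rows 0 and 2 of A
print(A.flags['C_CONTIGUOUS'])  # False
result = rows_not_in_void(A, B)
print(result.tolist())  # [False, True, False, True]
```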
This is safe to use with integer dtypes. However, note that floats which compare equal can have different byte representations (for example, `-0.` and `0.`), so using `np.in1d` after viewing as void may return incorrect results for arrays with float dtype.

Here is a benchmark of some of the proposed methods:
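One way such a comparison could be set up is sketched below with `timeit`. It is not the original benchmark: the function names `via_struct_view` and `via_void_view`, the array sizes, and the seed are all assumptions chosen for illustration, and only the two view-based approaches described above are timed. Timings depend on the machine and NumPy version, so none are quoted here.

```python
import timeit
import numpy as np

def via_struct_view(A, B):
    # View each row as one structured element (illustrative field names).
    A = np.ascontiguousarray(A)
    B = np.ascontiguousarray(B, dtype=A.dtype)
    row_dtype = [(f"f{i}", A.dtype) for i in range(A.shape[1])]
    return ~np.isin(A.view(row_dtype).ravel(), B.view(row_dtype).ravel(),
                    assume_unique=True)

def via_void_view(A, B):
    # View each row as one block of raw bytes (integer dtypes only).
    A = np.ascontiguousarray(A)
    B = np.ascontiguousarray(B, dtype=A.dtype)
    row_dtype = np.dtype((np.void, A.dtype.itemsize * A.shape[1]))
    return ~np.isin(A.view(row_dtype).ravel(), B.view(row_dtype).ravel(),
                    assume_unique=True)

# Synthetic data: unique rows in A, and B a random row subset of A.
rng = np.random.default_rng(2015)
A = np.unique(rng.integers(0, 1000, size=(10_000, 3)), axis=0)
B = A[rng.choice(len(A), size=len(A) // 2, replace=False)]

for func in (via_struct_view, via_void_view):
    best = min(timeit.repeat(lambda: func(A, B), number=5, repeat=3)) / 5
    print(f"{func.__name__}: {best * 1e3:.2f} ms per call")
```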