Performing a semi-anti-join (in binary search)


I'd like to subset a data.table by selecting one value of the first key column and excluding one value of the second key column.

library(data.table)

set.seed(18032)
DT <- data.table(grp1 = sample(10, 1000, TRUE),
                 grp2 = sample(10, 1000, TRUE),
                 v = rnorm(1000), key = c("grp1", "grp2"))

My first instinct didn't work (! operated too early):

DT[.(10, !10)] # !10 evaluates to FALSE (i.e. 0), so this asks for the (10, 0) subset
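To make the evaluation order concrete, here is a minimal base-R sketch (no data.table needed; the variable names `negated` and `as_key` are only illustrative). It shows that `!` is ordinary logical negation, applied to the number before the join ever sees it:

```r
# `!` coerces its numeric argument to logical (non-zero is TRUE) and negates it
negated <- !10

# Converting the result back to a number gives the value the comment above refers to
as_key <- as.integer(negated)

print(negated)
print(as_key)
```

So by the time `.()` builds the lookup table, the second column no longer carries any "not 10" meaning.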

This seems too inelegant, but works:

DT[.(10, setdiff(unique(grp2), 10))] #unique(grp2) %\% 10 for the bold ;-)

And this also works, but this approach sacrifices some functionality (e.g., access to := on DT):

setkey(DT, grp2, grp1)
DT[!.(10)][CJ(grp2, 10, unique = TRUE)]
#equivalently
DT[!.(10)][.(unique(grp2), 10)]
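The lost access to := can be illustrated with a small sketch. The toy table below and the names `tmp` and `res` are made up for illustration, and the chain is split into steps for readability. The point is that each `[` returns a new data.table, so an assignment at the end of the chain updates that intermediate result rather than DT:

```r
library(data.table)

# Toy data (hypothetical), keyed in the swapped order used above
DT <- data.table(grp1 = c(10, 10, 10, 3),
                 grp2 = c(10, 4, 7, 4),
                 v    = 1:4,
                 key  = c("grp2", "grp1"))

# Step 1: anti-join drops rows whose first key column (grp2) equals 10
tmp <- DT[!.(10)]

# Step 2: keep only grp1 == 10 among the remaining grp2 values
res <- tmp[.(unique(tmp$grp2), 10), nomatch = 0L]

# := here modifies `res`, a separate object; DT is left as it was
res[, v := 0L]

print(res)
print(DT)
```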

Have I exhausted my options, or am I missing something?

There is 1 best solution below

This seems to do what I expected:

DT[ grp1==10 & grp2 != 10, ]

It also seems to allow targeted assignment if you use := in the j position.

As an example, this should succeed (with no loss of efficiency):

 DT[ grp1==10 & grp2 != 10, v := 0 ]
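
As a rough check, the following sketch rebuilds a table like the one in the question (with `v` drawn at full length, which is an assumption on my part) and compares the logical subset with the keyed `setdiff` join. The names `by_join`, `by_scan`, `same_rows`, and `all_zero` are only illustrative:

```r
library(data.table)

set.seed(18032)
DT <- data.table(grp1 = sample(10, 1000, TRUE),
                 grp2 = sample(10, 1000, TRUE),
                 v = rnorm(1000), key = c("grp1", "grp2"))

# Keyed join on grp1 == 10 and every observed grp2 value except 10
by_join <- DT[.(10, setdiff(unique(DT$grp2), 10)), nomatch = 0L]

# Plain logical subset, as in this answer
by_scan <- DT[grp1 == 10 & grp2 != 10]

# Row order may differ between the two, so compare sorted values
same_rows <- nrow(by_join) == nrow(by_scan) &&
  identical(sort(by_join$v), sort(by_scan$v))

# Assignment by reference on the selected rows only
DT[grp1 == 10 & grp2 != 10, v := 0]
all_zero <- DT[grp1 == 10 & grp2 != 10, all(v == 0)]

print(same_rows)
print(all_zero)
```

This only checks that the two forms agree on this data; it says nothing about their relative speed on larger tables.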