R vctrs subclassing with unicity constraint

In R, I need to define a vctrs subclass with a uniqueness constraint. Here is minimal code for such a new uid vector type.

library(vctrs)

new_uid <- function(x) {
  if (anyDuplicated(x) > 0) stop("Duplicated id")
  new_vctr(x,class="uid")
}

uid <- function(length = 0) {
  # seq_len() yields an empty vector when length is 0, unlike 0:(length - 1)
  new_uid(seq_len(length) - 1L)
}

# Self-self coercion methods: common type of two uid vectors, and uid-to-uid cast
vec_ptype2.uid.uid <- function(x, y, ...) uid()
vec_cast.uid.uid <- function(x, to, ...) x

I can create vectors of that type

x <- new_uid(1:10)
y <- new_uid(11:20)

and if I have duplicated values an error is produced

z <- new_uid(c(1:10,2))
Error in new_uid(c(1:10, 2)) : Duplicated id

The concatenation works and produces a longer uid vector

vec_c(x,y)
<uid[20]>
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20

But if the concatenation creates duplicates, the constraint is not checked, so no error is raised.

vec_c(x,x)
<uid[20]>
 [1]  1  2  3  4  5  6  7  8  9 10  1  2  3  4  5  6  7  8  9 10
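The silent failure above can be made visible with a small check. This is only a sketch: it assumes a current vctrs release (where the self-self methods are named vec_ptype2 and vec_cast), and it uses vec_data() to look at the underlying values of the result.

```r
library(vctrs)

# Same constructor as in the question.
new_uid <- function(x = integer()) {
  if (anyDuplicated(x) > 0) stop("Duplicated id")
  new_vctr(x, class = "uid")
}

# Self-self methods, named as in current vctrs releases (assumption).
vec_ptype2.uid.uid <- function(x, y, ...) new_uid()
vec_cast.uid.uid <- function(x, to, ...) x

x <- new_uid(1:10)

# vec_c() builds its result without going through new_uid(),
# so the duplicates have to be looked for explicitly.
res <- vec_c(x, x)
has_dups <- anyDuplicated(vec_data(res)) > 0
```

Here has_dups flags the problem after the fact, which is what the class should ideally do by itself.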

As proposed by @Rui-Barradas, base::c can be specialized:

c.uid <- function(...){
  x <- unlist(list(...))
  #x <- unique(x)
  new_uid(x)
}

c(x, x)
# Error in new_uid(x) : Duplicated id 

But this does not impact the vctrs::vec_c() function used by tibble...

Can I define something that forces the uniqueness check after concatenation with vec_c()?

All the best,

There are 2 best solutions below

After looking again at the vctrs vignette "S3 vectors", I tested a solution that works.

It requires defining a vec_restore.uid method that checks uniqueness before handing over to NextMethod(), which is usually the default method:

vec_restore.uid <- function(x, to, ...) {
  if (anyDuplicated(na.omit(x)) > 0)
    stop("Duplicated id")
  NextMethod()
}

after that we get the correct behaviour

vec_c(x,x)
Error in vec_restore.uid(x = x, to = to, n = n) : Duplicated id

and the same with

c(x,x)
Error in vec_restore.uid(x = x, to = to, n = n) : Duplicated id

while

c(x,y)
<uid[20]>
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20

continues to provide the correct answer.
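For convenience, the pieces can be put together in one script. This is a sketch rather than a definitive implementation: the self-self method names assume a current vctrs release, and fails() is a hypothetical helper added only to capture the error without stopping the script.

```r
library(vctrs)

# Constructor: rejects duplicated values up front.
new_uid <- function(x = integer()) {
  if (anyDuplicated(x) > 0) stop("Duplicated id")
  new_vctr(x, class = "uid")
}

# Self-self methods, named as in current vctrs releases (assumption).
vec_ptype2.uid.uid <- function(x, y, ...) new_uid()
vec_cast.uid.uid <- function(x, to, ...) x

# Called by vctrs when it rebuilds a uid from bare data, e.g. in vec_c().
# na.omit() keeps missing values from being counted as duplicates.
vec_restore.uid <- function(x, to, ...) {
  if (anyDuplicated(na.omit(x)) > 0) stop("Duplicated id")
  NextMethod()
}

x <- new_uid(1:10)
y <- new_uid(11:20)

# Hypothetical helper: TRUE when evaluating `expr` raises an error.
fails <- function(expr) inherits(try(expr, silent = TRUE), "try-error")

ok_len  <- length(vec_c(x, y))  # distinct ids concatenate normally
dup_err <- fails(vec_c(x, x))   # duplicated ids are rejected
```

Since vctrs appears to call vec_restore() after other operations as well (slicing, for instance), the check probably also runs there, which adds a small cost to each of them.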

Thank you for your help.

I am not sure that the following is what you want, as I have (very) little experience with vctrs classes.

c.uid <- function(...){
  x <- unlist(list(...))
  #x <- unique(x)
  new_uid(x)
}

c(x, x)
# Error in new_uid(x) : Duplicated id