I want to take advantage of the fast data.table::fcase() instead of dplyr::case_when(), but I don't know how to keep each row's existing value as the default (rather than a fixed value).
Say you have
library(data.table)
dt <- data.table(v1 = c("1", "2", "3", NA))
Using dplyr's case_when()
dt %>%
  mutate(v2 = case_when(is.na(v1) ~ "0",
                        TRUE ~ v1))
you get the expected result
> dt %>%
+ mutate( v2 = case_when(is.na(v1) ~ "0",
+ TRUE ~ v1))
     v1 v2
1:    1  1
2:    2  2
3:    3  3
4: <NA>  0
Using data.table's fcase()
dt[, v2 := fcase(is.na(v1), "0",
                 default = v1)]
(or anything similar to that code) you get the error
> dt[ , v2 := fcase(is.na(v1), "0",
+ default = v1)]
Error in fcase(is.na(v1), "0", default = v1) :
Length of 'default' must be 1.
I believe this is because v1 is passed as the full column, while fcase() expects a single value for default.
How do I fix it?
The data.table equivalent of dplyr::case_when(TRUE ~ expr) is fcase(rep(TRUE, .N), expr). But in this case, the "coalesce" function that both dplyr and data.table provide is a better fit for that logic:
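A minimal sketch of both approaches, assuming the dt from the question. The column name v3 is only there for illustration so the two results can be compared side by side, and fcoalesce() is used here as data.table's coalesce function:

```r
library(data.table)

dt <- data.table(v1 = c("1", "2", "3", NA))

# Option 1: fcase() with a catch-all condition as the last case.
# rep(TRUE, .N) is a logical vector as long as the table, so v1 can be
# supplied as a full-length value instead of a length-1 default.
dt[, v2 := fcase(is.na(v1), "0",
                 rep(TRUE, .N), v1)]

# Option 2: fcoalesce() keeps v1 where it is not NA and falls back to "0".
# (dplyr::coalesce(v1, "0") expresses the same logic in dplyr.)
dt[, v3 := fcoalesce(v1, "0")]

print(dt)
```

Both columns should contain "1", "2", "3", "0". Note that the fallback has to be the same type as v1 (character here), so "0" rather than 0.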