Add leading zeros to character column in a Spark data frame using sparklyr

How can I add leading zeroes to data$var1 in the following MWE?

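library(sparklyr)
# Connect to Spark; a local master is assumed here, adjust to your setup
sc <- spark_connect(master = "local")
data <- data.frame(
  var1 = c("ab", "abc", "abcd"),
  var2 = c(1, 2, 3)
)
data <- sdf_copy_to(sc, data, "data", overwrite = TRUE)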

The data frame I would like to achieve (preferably by using dplyr::mutate) is:

# Source: spark<data> [?? x 2]
  var1   var2
  <chr> <dbl>
1 00ab      1
2 0abc      2
3 abcd      3 

I have already tried SparkR::lpad, stringr::str_pad, sprintf, etc., but with no luck.

There is 1 best solution below

Christoffer On

For anyone who finds this relevant: the solution I found was to write the SQL directly instead of relying on sparklyr's translation from R to SQL (which, as far as I understand, is what happens under the hood).

sparklyr::sdf_sql(sc, "SELECT LPAD(var1, 4, '0') AS var3 FROM data")
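
Since the question asks for a dplyr::mutate version, here is a minimal sketch of one way that may work. It relies on the fact that dbplyr passes function calls it does not recognise through to the SQL backend unchanged, so calling lpad() inside mutate() should end up as Spark SQL's LPAD. The local master and the variable names padded and result are assumptions for illustration, not part of the original post. Note that Spark's LPAD also shortens strings longer than the target length, so a 5-character value would be cut to 4 characters here.

```r
library(sparklyr)
library(dplyr)

# Assumption: a local Spark installation is available
sc <- spark_connect(master = "local")

data <- sdf_copy_to(
  sc,
  data.frame(var1 = c("ab", "abc", "abcd"), var2 = c(1, 2, 3)),
  "data",
  overwrite = TRUE
)

# lpad() is not an R function here: dbplyr does not recognise it and
# hands the call to Spark SQL. 4L is used so the length is sent as an integer.
padded <- data %>%
  mutate(var1 = lpad(var1, 4L, "0"))

# Inspect the SQL that sparklyr/dbplyr generated for the pipeline
show_query(padded)

# Bring the result back into R as a local tibble
result <- collect(padded)

spark_disconnect(sc)
```

Overwriting var1 inside mutate() keeps var2 in the result, which matches the layout shown in the question; the SQL query in the answer above returns only the padded column under the name var3.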