Applying dtplyr directly to a data.table instead of a lazy_dt

I want to perform several operations that mix dtplyr and data.table code. My question is whether, with dtplyr loaded, I can apply dplyr verbs directly to a data.table object and get optimized data.table code, as I would with a lazy_dt.

Below are some examples. In each one, would dtplyr translate the pipeline to data.table code, or is plain dplyr doing the work?

# Setup for all chunks:
library(dplyr)
library(data.table)
library(dtplyr)

a) setDT

dataframe # class data.frame
setDT(dataframe)

dataframe %>% 
  group_by(id) %>% 
  mutate(rows_per_group = n())

b) data.table object

dt <- as.data.table(dataframe) # or dt <- data.table::fread(filepath)
dt %>%
  group_by(id) %>% 
  mutate(rows_per_group = n())

Also, if all of them trigger dtplyr's translation, which is the most efficient option: a), b), or c) wrapping the data first with lazy_dt(dataframe)?
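One way to check this empirically is to inspect the class of each pipeline's result and, for the lazy_dt case, print the generated query. The following is only a minimal sketch: the toy `dataframe` is hypothetical, and it assumes dtplyr 1.0 or later, where (as far as I understand) translation is recorded in objects inheriting from "dtplyr_step" and happens only for pipelines that start from lazy_dt().

```r
library(dplyr)
library(data.table)
library(dtplyr)

# Hypothetical toy data standing in for `dataframe`
dataframe <- data.frame(id = c(1, 1, 2), x = c(10, 20, 30))

# c) explicit lazy_dt(): verbs are recorded as steps, not executed yet
lazy <- lazy_dt(dataframe) %>%
  group_by(id) %>%
  mutate(rows_per_group = n())

inherits(lazy, "dtplyr_step")  # is this a dtplyr translation object?
show_query(lazy)               # prints the data.table call dtplyr generated

# Force evaluation and get an ordinary data.table back
result <- as.data.table(lazy)

# b) dplyr verbs applied directly to a data.table
dt <- as.data.table(dataframe)
direct <- dt %>%
  group_by(id) %>%
  mutate(rows_per_group = n())

class(direct)                    # shows which kind of object came back
inherits(direct, "dtplyr_step")  # whether dtplyr was involved at all
```

If the last call returns FALSE, the pipeline in b) was handled by dplyr's data.frame methods rather than translated by dtplyr. The same check applies to a), since setDT() also produces a data.table.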
