Expand rows in data frame by length of sequence

727 Views Asked by At

I have a data frame like this

   mydf <- data.frame(x = c("a", "b", "q"), 
                      y = c("c", "d", "r"), 
                      min = c(2, 5, 3), 
                      max = c(4,6,7))

        x y min max
        a c   2   4
        b d   5   6
        q r   3   7

and I would like to expand it to this using a simple function

   mydf_expanded <- data.frame(x = c(rep("a",3), rep("b",2), rep("q",5)), 
                               y = c(rep("c",3), rep("d",2), rep("r",5)), 
                               min_to_max = c(2:4,5:6,3:7))

       x y min_to_max
       a c          2
       a c          3
       a c          4
       b d          5
       b d          6
       q r          3
       q r          4
       q r          5
       q r          6
       q r          7

Which is basically trying to expand mydf by length(mydf$min:mydf:max) of each row of mydf. But I also want to add that min_to_max column which is the sequence mydf$min:mydf:max within each row.

I know about this function which can do an expansion based on a "count" you specify

       library(splitstackshape)
       mydf_expand = expandRows(mydf, "count")

But this doesn't exactly get me to my solution. Please help! Thanks!

1

There are 1 best solutions below

0
On BEST ANSWER
df <- data.frame(x = c("a", "b", "q"), 
                   y = c("c", "d", "r"), 
                   min = c(2, 5, 3), 
                   max = c(4,6,7))
library(tidyverse)

df %>% 
  rowwise() %>% 
  transmute(x, y, all = list(seq(min, max))) %>% 
  unnest_longer(all)
#> # A tibble: 10 x 3
#>    x     y       all
#>    <chr> <chr> <int>
#>  1 a     c         2
#>  2 a     c         3
#>  3 a     c         4
#>  4 b     d         5
#>  5 b     d         6
#>  6 q     r         3
#>  7 q     r         4
#>  8 q     r         5
#>  9 q     r         6
#> 10 q     r         7

Created on 2021-03-11 by the reprex package (v1.0.0)

data.table

library(data.table)
setDT(df)

df$all <- apply(df[, c(3,4)], 1, function(x) seq(x[1], x[2]))

df[, list(all = unlist(all)), by = list(x, y)]
#>     x y all
#>  1: a c   2
#>  2: a c   3
#>  3: a c   4
#>  4: b d   5
#>  5: b d   6
#>  6: q r   3
#>  7: q r   4
#>  8: q r   5
#>  9: q r   6
#> 10: q r   7

Created on 2021-03-11 by the reprex package (v1.0.0)