Suppose I have a data frame df:
> dput(df)
structure(list(x = c("X", "X", "X", "Y", "Y", "Z", "Z", "Z"),
y = c("A", "B", "C", "B", "C", "A", "C", "D")), class = "data.frame", row.names = c(NA,
-8L))
> df
x y
1 X A
2 X B
3 X C
4 Y B
5 Y C
6 Z A
7 Z C
8 Z D
and I generate a list u1 as below:
u1 <- with(
df,
tapply(y, x, combn, 2, toString)
)
where
> u1
$X
[1] "A, B" "A, C" "B, C"
$Y
[1] "B, C"
$Z
[1] "A, C" "A, D" "C, D"
> str(u1)
List of 3
$ X: chr [1:3(1d)] "A, B" "A, C" "B, C"
$ Y: chr [1(1d)] "B, C"
$ Z: chr [1:3(1d)] "A, C" "A, D" "C, D"
- attr(*, "dim")= int 3
- attr(*, "dimnames")=List of 1
..$ : chr [1:3] "X" "Y" "Z"
When I run stack(u1), I get the following error:
> stack(u1)
Error in stack.default(u1) : at least one vector element is required
It seems that I cannot use stack on the output of tapply directly, even though it is a named list.
However, when I post-process it with u2 <- Map(c, u1), things work again:
> u2 <- Map(c, u1)
> u2
$X
[1] "A, B" "A, C" "B, C"
$Y
[1] "B, C"
$Z
[1] "A, C" "A, D" "C, D"
> str(u2)
List of 3
$ X: chr [1:3] "A, B" "A, C" "B, C"
$ Y: chr "B, C"
$ Z: chr [1:3] "A, C" "A, D" "C, D"
> stack(u2)
values ind
1 A, B X
2 A, C X
3 B, C X
4 B, C Y
5 A, C Z
6 A, D Z
7 C, D Z
As str(u2) shows, the attributes have been dropped, which seems to solve the issue.
My question is:
Why does stack fail on u1 but succeed on u2? Is there a way to use tapply so that its result works with stack without any post-processing (like Map(c, u1))?
tapply returns an array (or a list if you set simplify = FALSE), and stack doesn't like an array input. The tapply documentation doesn't sound like there are other output options. From ?tapply (emphasis mine):
So I'd recommend casting to character:
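A minimal sketch of that cast, assuming the df and u1 from the question. Wrapping as.character() in lapply() is one plausible way to write it, and the name u3 is only illustrative. The is.vector() checks are there because each element of u1 is a 1-d array (note the "(1d)" in the str output), and is.vector() returns FALSE for anything carrying a dim attribute; as.character() drops that attribute.

```r
# Data from the question
df <- data.frame(
  x = c("X", "X", "X", "Y", "Y", "Z", "Z", "Z"),
  y = c("A", "B", "C", "B", "C", "A", "C", "D")
)

u1 <- with(df, tapply(y, x, combn, 2, toString))

# Every element of u1 is a 1-d array, so none of them counts as a plain vector
vapply(u1, is.vector, logical(1))

# as.character() strips the dim attribute, leaving plain character vectors
u3 <- lapply(u1, as.character)
vapply(u3, is.vector, logical(1))

# stack() now accepts the list
res <- stack(u3)
res
```

This is still post-processing, just like Map(c, u1); it only makes the intent (drop the array attributes) more explicit.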
If you're concerned about speed, you could run benchmarks to see; removing the dim attribute might be faster than as.character().
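Removing the dim attribute could look something like the sketch below. The anonymous helper and the names u4, by_dim and by_char are assumptions for illustration, not part of the original answer, and no timing claim is made here; you would need to benchmark on your own data.

```r
# Data from the question
df <- data.frame(
  x = c("X", "X", "X", "Y", "Y", "Z", "Z", "Z"),
  y = c("A", "B", "C", "B", "C", "A", "C", "D")
)

u1 <- with(df, tapply(y, x, combn, 2, toString))

# Drop only the dim attribute from each element, leaving the values untouched
u4 <- lapply(u1, function(v) {
  dim(v) <- NULL
  v
})

by_dim  <- stack(u4)
by_char <- stack(lapply(u1, as.character))

# Both approaches should give the same stacked data frame
identical(by_dim, by_char)
```

Unlike as.character(), this variant does not coerce the type, so it would also work unchanged if the combined values were numeric.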