R - how to get a multi-level list using list.files


I have a folder structure like this:

- ConditionA
     - Subcondition1
          - data1.Rds
          - data2.Rds
     - Subcondition2
          - data1.Rds
          - data2.Rds
- ConditionB
     - Subcondition1
          - data1.Rds
          - data2.Rds
     - Subcondition2
          - data1.Rds
          - data2.Rds

Calling list.files(recursive = TRUE, full.names = TRUE) gives the following:

"./ConditionA/Subcondition1/data1.Rds"
"./ConditionA/Subcondition1/data2.Rds"
"./ConditionA/Subcondition2/data1.Rds"
"./ConditionA/Subcondition2/data2.Rds"
"./ConditionB/Subcondition1/data1.Rds"
"./ConditionB/Subcondition1/data2.Rds"
"./ConditionB/Subcondition2/data1.Rds"
"./ConditionB/Subcondition2/data2.Rds"

However, what I want instead is a list of lists representing the nested folder structure. The list should be identical to this one I will construct manually here:

sublist1 <- list("data1.Rds", "data2.Rds")
sublist2 <- list("data1.Rds", "data2.Rds")
sublist3 <- list("data1.Rds", "data2.Rds")
sublist4 <- list("data1.Rds", "data2.Rds")

sublist5 <- list(sublist1, sublist2)
names(sublist5) <- c("Subcondition1", "Subcondition2")

sublist6 <- list(sublist3, sublist4)
names(sublist6) <- c("Subcondition1", "Subcondition2")

final_list <- list(sublist5, sublist6)
names(final_list) <- c("ConditionA", "ConditionB")

Printing it:

final_list

gives the output:

$ConditionA
$ConditionA$Subcondition1
$ConditionA$Subcondition1[[1]]
[1] "data1.Rds"

$ConditionA$Subcondition1[[2]]
[1] "data2.Rds"


$ConditionA$Subcondition2
$ConditionA$Subcondition2[[1]]
[1] "data1.Rds"

$ConditionA$Subcondition2[[2]]
[1] "data2.Rds"



$ConditionB
$ConditionB$Subcondition1
$ConditionB$Subcondition1[[1]]
[1] "data1.Rds"

$ConditionB$Subcondition1[[2]]
[1] "data2.Rds"


$ConditionB$Subcondition2
$ConditionB$Subcondition2[[1]]
[1] "data1.Rds"

$ConditionB$Subcondition2[[2]]
[1] "data2.Rds"

How can I automate this instead of constructing the list manually?

Best answer (by r2evans)

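A fun exercise is to do this with a recursive function. It takes a list of character vectors (one per file, holding the path components). At each level, vectors of length 1 are files and are kept as they are; the remaining vectors are grouped by their first component, that component is dropped, and the function is called again on each group.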

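# L: list of character vectors, each holding the remaining path components
fun <- function(L) {
  len1 <- lengths(L) == 1   # TRUE where only a file name is left
  c(
    L[len1],                # files at this level, kept as-is
    if (any(!len1)) lapply(
      # drop the first component, group by it, then recurse into each group
      split(lapply(L[!len1], `[`, -1), sapply(L[!len1], `[[`, 1)),
      fun)
  )
}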
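
To make the recursion easier to follow, here is a small sketch that runs the function on an in-memory vector of paths, so no files are needed. The `paths` vector is made up for illustration (it includes one file that sits directly inside a top-level folder), and the function is repeated only so the snippet runs on its own.

```r
# Same recursive function as above, repeated so this snippet is self-contained.
fun <- function(L) {
  len1 <- lengths(L) == 1
  c(
    L[len1],
    if (any(!len1)) lapply(
      split(lapply(L[!len1], `[`, -1), sapply(L[!len1], `[[`, 1)),
      fun)
  )
}

# Hypothetical paths; "A/readme.txt" is a file directly under folder "A".
paths <- c("A/x/f1.Rds", "A/x/f2.Rds", "A/readme.txt", "B/y/f1.Rds")

# One character vector of path components per file.
parts <- strsplit(paths, "/", fixed = TRUE)

res <- fun(parts)
names(res)     # top-level folders
names(res$A)   # the loose file is unnamed, the subfolder is named "x"
res$A$x        # list holding the two file names
```

Files that live next to subfolders end up as unnamed elements of the same list, ahead of the named subfolder entries, because the function concatenates the length-1 entries first.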

Using a similar tree hierarchy:

list.files(recursive = TRUE, full.names = TRUE)
# [1] "./ConditionA/Subcondition1/data1.Rds" "./ConditionA/Subcondition1/data2.Rds" "./ConditionA/Subcondition2/data1.Rds"
# [4] "./ConditionA/Subcondition2/data2.Rds" "./ConditionB/Subcondition1/data1.Rds" "./ConditionB/Subcondition1/data2.Rds"
# [7] "./ConditionB/Subcondition2/data1.Rds" "./ConditionB/Subcondition2/data2.Rds"

We can do this:

list.files(recursive = TRUE, full.names = TRUE) |>
  # strip the leading "./" so that "." does not become the top level
  # (the `_` placeholder needs R >= 4.2)
  sub("^\\./", "", x = _) |>
  strsplit("/") |>
  fun() |>
  str()
# List of 2
#  $ ConditionA:List of 2
#   ..$ Subcondition1:List of 2
#   .. ..$ : chr "data1.Rds"
#   .. ..$ : chr "data2.Rds"
#   ..$ Subcondition2:List of 2
#   .. ..$ : chr "data1.Rds"
#   .. ..$ : chr "data2.Rds"
#  $ ConditionB:List of 2
#   ..$ Subcondition1:List of 2
#   .. ..$ : chr "data1.Rds"
#   .. ..$ : chr "data2.Rds"
#   ..$ Subcondition2:List of 2
#   .. ..$ : chr "data1.Rds"
#   .. ..$ : chr "data2.Rds"

(This is the same nested structure as requested, shown compactly with str() for presentation here.)
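
For anyone who wants to try this without touching real data, the following is a minimal end-to-end sketch. It assumes a throwaway folder under tempdir() (the name "nested_demo" is arbitrary) filled with empty placeholder files, and it calls list.files() with the default full.names = FALSE, which returns paths relative to the given directory so there is no "./" prefix to strip. That is a variation on the pipeline above, not a requirement.

```r
# Same recursive function as above, repeated so this snippet is self-contained.
fun <- function(L) {
  len1 <- lengths(L) == 1
  c(
    L[len1],
    if (any(!len1)) lapply(
      split(lapply(L[!len1], `[`, -1), sapply(L[!len1], `[[`, 1)),
      fun)
  )
}

# Build a throwaway copy of the folder layout in a temporary directory.
# The files are empty placeholders, not real Rds files.
root <- file.path(tempdir(), "nested_demo")
for (cond in c("ConditionA", "ConditionB")) {
  for (subcond in c("Subcondition1", "Subcondition2")) {
    d <- file.path(root, cond, subcond)
    dir.create(d, recursive = TRUE, showWarnings = FALSE)
    file.create(file.path(d, c("data1.Rds", "data2.Rds")))
  }
}

# With full.names = FALSE (the default), paths are relative to `root`.
rel <- list.files(root, recursive = TRUE)
nested <- fun(strsplit(rel, "/", fixed = TRUE))
str(nested)

# Remove the temporary folder again.
unlink(root, recursive = TRUE)
```

If the list is later used to read the files, the full paths would have to be rebuilt, for example with file.path(root, ...), since only the file names are stored at the leaves.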