How to subset by common colnames from list of dataframes?


How can I subset every data frame in a list by column name?

For example, I have MyDfList, a list of 3 data frames with 4 columns each, and I want to keep only two columns (A and D) from every data frame in the list.

MyDfList <- list(data.frame(A = "S1", B = "2208", C ="2399", D="1.086504"),
                data.frame(A = "S2", B = "6756", C ="6970", D="1.031676"),
                data.frame(A = "S3", B = "8271", C ="8401", D="1.015718"))

I tried the following, but could not get it to work.

out0<-lapply(MyDfList , function(x) c("A","D") %in% colnames(x))

out1 <- Filter(function(x) c("A","D") %in% names(x), MyDfList )

out2<-MyDfList[sapply(MyDfList , function(x) c("A","D") %in% colnames(x))]

There are 3 best solutions below

1
TarJae On

base R:

Here we subset each data frame by applying the anonymous function function(df) df[, c("A", "D")] to every element of the list with lapply().

lapply(MyDfList, function(df) df[, c("A", "D")])

# As of R 4.1.0, the shorthand \(x) can be used instead of function(x) for anonymous functions.
lapply(MyDfList, \(x) x[, c("A", "D")])
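
If the columns to keep are not known in advance, the same lapply() pattern can be combined with Reduce(intersect, ...) to find the names shared by every data frame. This is only a sketch: the list dfs below is hypothetical example data (its third data frame lacks column C), and common_cols and out are illustrative names.

```r
# Hypothetical example data: the third data frame has no column C
dfs <- list(
  data.frame(A = "S1", B = "2208", C = "2399", D = "1.086504"),
  data.frame(A = "S2", B = "6756", C = "6970", D = "1.031676"),
  data.frame(A = "S3", B = "8271", D = "1.015718")
)

# Column names that occur in every data frame of the list
common_cols <- Reduce(intersect, lapply(dfs, names))

# Keep only the shared columns; drop = FALSE guarantees a data frame
# even when a single column is left
out <- lapply(dfs, function(df) df[, common_cols, drop = FALSE])
```

With this example data, common_cols contains A, B and D, so each element of out has those three columns.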

tidyverse:

Here we apply the purrr-style lambda ~select(., A, D) to each data frame with map().

library(purrr)
library(dplyr)

MyDfList %>%
  map(~select(., A, D))

[[1]]
   A        D
1 S1 1.086504

[[2]]
   A        D
1 S2 1.031676

[[3]]
   A        D
1 S3 1.015718
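
When the column names are stored in a character vector rather than typed into select(), a variant along these lines should work. It assumes a reasonably recent dplyr that re-exports all_of() from tidyselect; the names cols and out are only illustrative.

```r
library(purrr)
library(dplyr)

MyDfList <- list(data.frame(A = "S1", B = "2208", C = "2399", D = "1.086504"),
                 data.frame(A = "S2", B = "6756", C = "6970", D = "1.031676"),
                 data.frame(A = "S3", B = "8271", C = "8401", D = "1.015718"))

# Columns to keep, held in a character vector (illustrative name)
cols <- c("A", "D")

# all_of() tells select() to treat cols as a vector of column names;
# it raises an error if any of the names is missing from a data frame
out <- map(MyDfList, ~ select(.x, all_of(cols)))
```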

0
Onyambu On

In base R, pass the '[' operator itself as the function, with the column names as its argument:

lapply(MyDfList, '[', c('A', 'D'))
[[1]]
   A        D
1 S1 1.086504

[[2]]
   A        D
1 S2 1.031676

[[3]]
   A        D
1 S3 1.015718
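
This works because '[' is an ordinary function in R, so lapply() can call it with the column names as an extra argument. A small sketch on a single data frame (df and the result names are illustrative) shows the equivalence, and also how this one-argument form differs from df[, ...] when only one column is requested:

```r
df <- data.frame(A = "S1", B = "2208", C = "2399", D = "1.086504")

# Calling `[` as a function is the same as writing df[c("A", "D")]
same_result <- identical(`[`(df, c("A", "D")), df[c("A", "D")])

# List-style indexing keeps a data frame even for a single column
one_col_list_style <- is.data.frame(df["A"])

# Matrix-style indexing drops a single column to a plain vector by default
one_col_matrix_style <- is.data.frame(df[, "A"])
```

Here same_result and one_col_list_style are TRUE, while one_col_matrix_style is FALSE.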
0
Parfait On

Consider the aptly named subset() function in base R, a convenient way to subset by rows and/or columns:

lapply(MyDfList, subset, select=c(A, D))
# [[1]]
#    A        D
# 1 S1 1.086504
#
# [[2]]
#    A        D
# 1 S2 1.031676
# 
# [[3]]
#    A        D
# 1 S3 1.015718
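
To illustrate the "rows and/or columns" part, here is a sketch that binds the list into one data frame and filters rows and columns in a single call. The names combined and res are illustrative, and the row condition is made up for the example. Note that the help page for subset() describes it as a convenience function for interactive use, so '[' may be the safer choice inside functions.

```r
MyDfList <- list(data.frame(A = "S1", B = "2208", C = "2399", D = "1.086504"),
                 data.frame(A = "S2", B = "6756", C = "6970", D = "1.031676"),
                 data.frame(A = "S3", B = "8271", C = "8401", D = "1.015718"))

# Stack the three one-row data frames into a single data frame
combined <- do.call(rbind, MyDfList)

# Second argument filters rows, select picks columns (unquoted names)
res <- subset(combined, A != "S2", select = c(A, D))
```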