Subset columns in df using for loop in R


I would like to subset a dataset into several smaller df's while using the names of columns in a for-loop.

Create some data (this can probably be done more easily...)

Line1 <- c(1:7)
Line2 <- c(3:9)
df <- rbind(Line1, Line2)
ColumnNames <- c(paste0("Var", 1:7))
colnames(df) <- ColumnNames

As output I would like to have three dataframes: Df_Sub1, Df_Sub2 and Df_Sub3.

Sub1 <- c("Var1", "Var2", "Var3")
Sub2 <- c("Var3", "Var4")
Sub3 <- c("Var5", "Var6", "Var7")

Subs <- c("Sub1", "Sub2", "Sub3")

For loop to create the three subsets

for (Sub in Subs) { 
  name <- as.name(paste0("df_",Sub))
  df_ <- df[,colnames(df) %in% get(Sub)]
}
 

How to stick name after df_ for each of the three Subs (or do something else to make this work)?


There are 2 best solutions below

We can use mget to return the values of the objects named in Subs as a list, rename the list elements with setNames, select the matching columns of 'df' for each element, and create the resulting objects in the global environment with list2env:

list2env(lapply(setNames(mget(Subs), paste0("Df_", Subs)),
        function(x) df[,x]), .GlobalEnv)

Output:

Df_Sub1
#      Var1 Var2 Var3
#Line1    1    2    3
#Line2    3    4    5
Df_Sub2
#      Var3 Var4
#Line1    3    4
#Line2    5    6
Df_Sub3
#      Var5 Var6 Var7
#Line1    5    6    7
#Line2    7    8    9
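
The one-liner above packs several steps into a single call. A minimal sketch that unpacks it, assuming the same data as in the question, might look like the following. The intermediate names cols and pieces are made up for illustration, and drop = FALSE is an addition not present in the original answer.

```r
# Rebuild the example data from the question (a 2 x 7 matrix).
df <- rbind(Line1 = 1:7, Line2 = 3:9)
colnames(df) <- paste0("Var", 1:7)

Sub1 <- c("Var1", "Var2", "Var3")
Sub2 <- c("Var3", "Var4")
Sub3 <- c("Var5", "Var6", "Var7")
Subs <- c("Sub1", "Sub2", "Sub3")

# Step 1: mget() looks up each name in Subs and returns a named list
# of the column-name vectors.
cols <- mget(Subs)

# Step 2: rename the list elements to the desired object names.
names(cols) <- paste0("Df_", Subs)

# Step 3: subset the columns for each element; drop = FALSE keeps a
# matrix even if a subset has only one column.
pieces <- lapply(cols, function(x) df[, x, drop = FALSE])

# Step 4 (optional): copy each list element into the global environment.
invisible(list2env(pieces, envir = .GlobalEnv))
```

Stopping after step 3 and working with pieces[["Df_Sub1"]] and so on is often enough, and it avoids creating many loose objects in the workspace.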

A for-loop version works if the column-name vectors are stored in a named list:

#Data
Line1 <- c(1:7)
Line2 <- c(3:9)
df <- rbind(Line1, Line2)
ColumnNames <- c(paste0("Var", 1:7))
colnames(df) <- ColumnNames
#Data 2
Sub1 <- c("Var1", "Var2", "Var3")
Sub2 <- c("Var3", "Var4")
Sub3 <- c("Var5", "Var6", "Var7")
Subs <- c("Sub1", "Sub2", "Sub3")
#Store in a list
List <- list(Sub1=Sub1,Sub2=Sub2,Sub3=Sub3)
#List for data
List2 <- list()
#Loop
for(i in Subs)
{
  List2[[i]] <- df[,List[[i]],drop=FALSE]
}
#Format names
names(List2) <- paste0('df_',names(List2))
#Set to envir
list2env(List2,envir = .GlobalEnv)

Output:

df_Sub1
      Var1 Var2 Var3
Line1    1    2    3
Line2    3    4    5

df_Sub2
      Var3 Var4
Line1    3    4
Line2    5    6

df_Sub3
      Var5 Var6 Var7
Line1    5    6    7
Line2    7    8    9
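
For the literal question of how to attach the name after df_ inside the loop, one possible sketch uses base R's assign(), which creates an object whose name is built at run time. This is not taken from either answer above. The list name col_sets is hypothetical and simply replaces the separate Sub1, Sub2 and Sub3 variables.

```r
# Rebuild the example data from the question (a 2 x 7 matrix).
df <- rbind(Line1 = 1:7, Line2 = 3:9)
colnames(df) <- paste0("Var", 1:7)

# Hypothetical helper list: one entry per subset, holding its column names.
col_sets <- list(
  Sub1 = c("Var1", "Var2", "Var3"),
  Sub2 = c("Var3", "Var4"),
  Sub3 = c("Var5", "Var6", "Var7")
)

for (sub in names(col_sets)) {
  # Build the target name (e.g. "df_Sub1") and bind the subset to it
  # in the current environment.
  assign(paste0("df_", sub), df[, col_sets[[sub]], drop = FALSE])
}

# The objects df_Sub1, df_Sub2 and df_Sub3 now exist.
colnames(df_Sub2)
```

Keeping the subsets in a single named list, as in the answer above, is usually easier to iterate over later. assign() is mainly useful when separate objects are really required.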