I'm trying to import multiple CSV files into RStudio while keeping their file names.
library(readr)
library(dplyr)
library(purrr)
#importing all csv files at once
# list.files() expects a regular expression, not a shell glob
csv_files = list.files(pattern = "Con\\.csv$")
myfiles = lapply(csv_files , read.delim, header = TRUE, sep = "," )
#merging all files by identifiers
Samp_merg <- myfiles %>% reduce(full_join, by=c("chr", "start","end"))
This imports the files, but the file names are missing from the list myfiles. My second attempt:
myfiles <- dir(pattern = "Con\\.csv$", full.names = FALSE)
myfiles_data <- lapply(myfiles, data.table::fread)
# assign names to list items
names(myfiles_data) <- myfiles
#merging the files
dat_merg <- myfiles_data %>% reduce(full_join, by=c("chr", "start", "end"))
With this script I can import the files and keep their names in the myfiles_data object. However, after joining on the three identifiers, the file names are not carried over as column names. I want each value column of the merged df to be named after its source file, without the extension (.csv).
There are around 90 CSV files present in the directory with the same header.
$ls
01AvPMPpCon.csv
02AvPMPpCon.csv
03AvPMPpCon.csv
04AvPMPpCon.csv
05AvPMPpCon.csv
$head 01AvPMPpCon.csv
chr,start,end,CpG
chr1,2017424,2017750,10
chr1,24901325,24901700,11
chr1,24902268,24902701,25
chr1,24927215,24927416,4
chr1,26861926,26862173,5
chr1,26864186,26864613,15
chr1,35576334,35576451,3
chr1,36304606,36304817,7
At the moment, the merged data looks like this:
$head(dat_merg)
chr start end CpG.x CpG.y CpG.x.x CpG.y.y CpG.x.x.x CpG.y.y.y
1: chr1 3903250 3903277 4 NA NA NA 4 NA
2: chr1 4657240 4657314 3 NA NA NA NA NA
3: chr1 24900249 24900468 5 NA 5 NA NA NA
4: chr1 46484938 46485047 4 NA 4 NA NA NA
5: chr1 47223634 47223758 4 NA NA NA 4 4
6: chr1 66752822 66753167 12 12 NA NA 12 NA
My expected output should look like this:
$head(dat_merg)
chr start end 01Av 02Av 03Av 04Av 05Av 06Av
1: chr1 3903250 3903277 4 NA NA NA 4 NA
2: chr1 4657240 4657314 3 NA NA NA NA NA
3: chr1 24900249 24900468 5 NA 5 NA NA NA
4: chr1 46484938 46485047 4 NA 4 NA NA NA
5: chr1 47223634 47223758 4 NA NA NA 4 4
6: chr1 66752822 66753167 12 12 NA NA 12 NA
What about pivot_wider() instead of reduce(full_join, ...)?
Prepare a reprex first: 99 four-row CSV files.
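A reprex setup along these lines could look like the sketch below. The temporary directory, the pool of regions and the random CpG counts are all made up for illustration; only the file naming and the chr,start,end,CpG header follow the question.

```r
# Hypothetical reprex data: 99 four-row CSV files in a temporary directory.
set.seed(1)
out_dir <- file.path(tempdir(), "reprex_csv")
dir.create(out_dir, showWarnings = FALSE)

# A shared pool of regions, so that some regions occur in several files
regions <- data.frame(
  chr   = "chr1",
  start = seq(1000L, by = 1000L, length.out = 20),
  end   = seq(1500L, by = 1000L, length.out = 20)
)

for (i in 1:99) {
  # Pick 4 distinct regions per file and attach a made-up CpG count
  rows <- regions[sort(sample(nrow(regions), 4)), ]
  rows$CpG <- sample(3:25, 4, replace = TRUE)
  write.csv(rows,
            file.path(out_dir, sprintf("%02dAvPMPpCon.csv", i)),
            row.names = FALSE, quote = FALSE)
}

# pattern is a regular expression; full.names = TRUE keeps the directory
csv_files <- list.files(out_dir, pattern = "Con\\.csv$", full.names = TRUE)
```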
read_csv() can read from a vector of files and stores the file names in an id column; c(chr, start, end) is then used as id_cols for pivot_wider().
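A minimal sketch of that idea, not the original answer's code. It assumes readr 2.0 or later (for the id argument of read_csv()), writes two tiny made-up files so it runs on its own, and assumes every file name ends in "PMPpCon.csv" so that the prefix ("01Av", "02Av", ...) can serve as the column name.

```r
library(readr)
library(dplyr)
library(tidyr)

# Two tiny example files in a temp dir (values are illustrative only)
tmp <- file.path(tempdir(), "pivot_demo")
dir.create(tmp, showWarnings = FALSE)
writeLines(c("chr,start,end,CpG",
             "chr1,3903250,3903277,4",
             "chr1,66752822,66753167,12"),
           file.path(tmp, "01AvPMPpCon.csv"))
writeLines(c("chr,start,end,CpG",
             "chr1,66752822,66753167,12",
             "chr1,4657240,4657314,3"),
           file.path(tmp, "02AvPMPpCon.csv"))

csv_files <- list.files(tmp, pattern = "Con\\.csv$", full.names = TRUE)

dat_wide <- read_csv(csv_files, id = "file", show_col_types = FALSE) %>%
  # ".../01AvPMPpCon.csv" -> "01Av" (assumes the common suffix shown here)
  mutate(file = sub("PMPpCon\\.csv$", "", basename(file))) %>%
  # one row per region, one CpG column per source file
  pivot_wider(id_cols = c(chr, start, end),
              names_from = file,
              values_from = CpG)
```

Regions missing from a file end up as NA in that file's column. If a region can occur more than once within a single file, pivot_wider() will warn and return list-columns, so the data would need to be aggregated first.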
Created on 2024-01-18 with reprex v2.0.2
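If you would rather keep the reduce(full_join, ...) approach from the question, one option is to rename the CpG column of each list element to its file name before joining, so the join never has to invent .x/.y suffixes. The sketch below uses a small made-up named list in place of myfiles_data, and rename_cpg is a hypothetical helper, not part of any package.

```r
library(dplyr)
library(purrr)

# Stand-in for the named list from the question (values are illustrative)
myfiles_data <- list(
  "01AvPMPpCon.csv" = data.frame(chr = "chr1", start = c(100, 200),
                                 end = c(150, 250), CpG = c(4, 12)),
  "02AvPMPpCon.csv" = data.frame(chr = "chr1", start = c(200, 300),
                                 end = c(250, 350), CpG = c(12, 3))
)

# Hypothetical helper: rename "CpG" to the file name without its extension
rename_cpg <- function(df, file_name) {
  names(df)[names(df) == "CpG"] <- tools::file_path_sans_ext(basename(file_name))
  df
}

dat_merg <- myfiles_data %>%
  imap(rename_cpg) %>%   # imap() passes each element together with its name
  reduce(full_join, by = c("chr", "start", "end"))
```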