I am trying to import 9000 .csv files into R to create one master file, and would like to do it much more efficiently than
read.csv(file="filename", header=TRUE, sep="\t")
Furthermore, I want to skip the first 7 lines in each .csv, as they contain information about the file itself but not the data, though not before I retrieve the information from those lines and add it as new columns in the data file so that I can identify each subsequent file later on.
I've used the skip=7
option when importing individual .csv files before with no issue, but I haven't been able to import multiple files at once, let alone while extracting some information from those first 7 lines first.
I've also tried reading in many .csv files from one folder using the following code:
temp = list.files(pattern="*.csv")
myfiles = lapply(temp, read.delim)
Every .csv takes the following format:
Program 5.5.3
"rawFileName=""C:\...."""
From=0:00.0, To=3:32:13.7
Date=24May2014
Athlete=John Smith
EventDescription=Round 10 v Team B
Time Var1 Var2 Var3 Var4 Var5
0:00 0 0 0 0 0
0:01 1 1 4 0 0
and I want my code to make them look like this:
Time   Var1 Var2 Var3 Var4 Var5 From   To        Date      Athlete    EventDescription
0:00.0 0    0    0    0    0    0:00.0 3:32:13.7 24May2014 John Smith Round 10 v Team B
0:00.1 1    1    4    0    0    0:00.0 3:32:13.7 24May2014 John Smith Round 10 v Team B
The next athlete's data would be added below, following the same format, and so on.
Has anyone else wanted to achieve something similar, and if so, how did you do it?
You want to manually extract the first 7 lines and leave the rest for
read.delim
. You can do that by using textConnection
, which allows you to pass strings to functions like read.table
. Then parse the metadata as you would normally. I would put all of this in a function that outputs the table and the metadata in a list. That way you end up with a list of lists that you can merge afterwards.