I have been trying to write a code that will read selected sheets in an .xls file using import_list (rio package), then rbinding and converting them to a .csv file. So far, I have been able to make the code read either all the sheets in the xls file at the same time, sequential sheets (eg 1:4) but not selected sheets based on their names. Please help. I have attached the current iteration of my code and a part of my data.
library(rio)
d <- import_list("allTP.xls", which = (("201901,Xizhi,PM10")), setclass="data.table",range = "A2:Y33", rbind = T, col_names=T, rbind_label = "_file", rbind_fill = TRUE)
The
which
argument has to be a vector of sheet names. That is, it needs to be in this format, using thec
function:or whatever sheets you want to import.
The
which
argument doesn't appear to support regular expression, so importing sheets using wildcards is not possible. But you can import all sheets and then use the_file
column to select the data to individually export (if you know which sheet number is which since the sheet names are not saved, only the number).In your data, you have all the information in the first row, but you don't import this because you specify the range as "A2:Y33". If you select "A1:Y33" then you'll get "characters" for all columns, and you don't want that since the column names start in row 2.
One solution is to import all sheets, range="A2:Y33", as you did, and then repeat for range "A1:Y2", saving the result as
d2
. This range contains the information you need to select. You can then merged
withd2
on_file
and export individual csv files based on the information ind2
.For
d2
we import only range "A1:Y1", then subset columns 1, 7, 13 and 26, and set the names appropriately.Then merge the two datasets on the
_file
column.Check the data.
So there's eight sheets each containing 31 records. But we have to clean these names because if we use these for the filenames, R will complain.
Now split on the interaction of all three contextual variables
And save to individual CSV files.
You can then easily choose to export only the data you want using a subset of
f
ord3
.