CSV to disk frame with multiple CSVs


I'm getting this error when trying to import CSVs using this code:

some.df = csv_to_disk.frame(list.files("some/path"))

Error in split_every_nlines(name_in = normalizePath(file, mustWork = TRUE), : Expecting a single string value: [type=character; extent=3].

As a temporary workaround, I used a for loop that read each file into its own disk frame and then row-bound all the disk frames together.

I took the code from the "ingesting data" doc.

xiaodai On BEST ANSWER

This seems to be an error triggered by the bigreadr package. I wonder if you have a way to reproduce the chunks.

Or maybe try a different chunk reader:

csv_to_disk.frame(..., chunk_reader = "data.table")

Also, if all else fails (CSV reading is hard), reading the files in a loop and appending them would work as well.
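The loop-and-append fallback could look roughly like the sketch below. It assumes each CSV fits in memory on its own, since every file becomes one chunk. The directory and the file names (`csv_loop_demo`, `part1.csv`, ...) are made up for illustration and the built-in `cars` data stands in for real input.

```r
library(disk.frame)
setup_disk.frame()

# Hypothetical input: two small CSVs written to a scratch directory
csv_dir = file.path(tempdir(), "csv_loop_demo")
dir.create(csv_dir, showWarnings = FALSE)
for (i in 1:2) {
  data.table::fwrite(cars, file.path(csv_dir, sprintf("part%s.csv", i)))
}
csv_files = list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)

# Start from an empty disk.frame at a fresh path and append one chunk per CSV
some.df = disk.frame(tempfile(fileext = ".df"))
for (f in csv_files) {
  add_chunk(some.df, data.table::fread(f))
}

# Number of chunks now stored on disk
nchunks(some.df)
```

If the individual files are too large for that, converting each one with csv_to_disk.frame and combining the results with rbindlist.disk.frame is another option, at the cost of writing the data twice.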

Perhaps you need to restrict the listing to CSV files and return their full paths? For example:

list.files("some/path", pattern = "\\.csv$", full.names = TRUE)
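Two details of list.files matter here: `pattern` is a regular expression rather than a glob, and without `full.names = TRUE` only bare file names come back, which resolve only if the working directory happens to be that folder. A small base R sketch, using a scratch directory and made-up file names:

```r
# Hypothetical scratch directory with made-up file names
demo_dir = file.path(tempdir(), "list_files_demo")
dir.create(demo_dir, showWarnings = FALSE)
invisible(file.create(file.path(demo_dir, c("a.csv", "b.csv", "mycsv_notes.txt"))))

# An unescaped "." matches any character, so this also picks up mycsv_notes.txt
loose = list.files(demo_dir, pattern = ".csv")

# Escaped dot plus end-of-string anchor: only names ending in .csv
strict = list.files(demo_dir, pattern = "\\.csv$")

# full.names = TRUE prepends the directory, so the paths can be read from anywhere
strict_paths = list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

print(loose)
print(strict)
print(strict_paths)
```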

Otherwise, it normally works:

library(disk.frame)
setup_disk.frame()

tmp = tempdir()

# write ten copies of the flights data as separate CSV files
invisible(sapply(1:10, function(x) {
  data.table::fwrite(nycflights13::flights, file.path(tmp, sprintf("tmp%s.csv", x)))
}))

# pass the full paths of all the CSVs in a single call
some.df = csv_to_disk.frame(list.files(tmp, pattern = "\\.csv$", full.names = TRUE))