I'm trying to get spreadsheet data from zipped .xlsx files. I'm using rubyzip to access the contents of the zip file:

    require 'zip'

    Zip::File.open(file_path) do |zip_file|
      zip_file.each do |entry|
        # process entry
      end
    end
My problem is that rubyzip gives me a Zip::Entry object, which I can't get to work with gems like roo or creek.
I've done something similar with .csv files. That was as simple as CSV.parse(entry.get_input_stream.read). However, that just gives me a string of encoded gibberish when I use it on an .xlsx file.
I've looked around and the closest answer I got was temporarily extracting the files, but I want to avoid doing this since the files can get pretty large.
Does anyone have any suggestions? Thanks in advance.
So what you need to do is convert the stream into an IO object that Roo can understand.
To determine whether the object passed to Roo::Spreadsheet.open is a "stream", Roo checks whether it responds to seek.
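As a rough sketch of that check: the method name stream_like? and the ForwardOnlyReader class below are made up for illustration, and Roo's own source may look different depending on the version. The idea boils down to a respond_to? call:

```ruby
require 'stringio'

# Hypothetical stand-in for a forward-only reader, such as the stream a
# zip entry hands back: it can be read, but it has no #seek.
class ForwardOnlyReader
  def initialize(data)
    @io = StringIO.new(data)
  end

  def read(*args)
    @io.read(*args)
  end
end

# Illustrative version of the check: anything that responds to #seek
# is treated as a stream.
def stream_like?(object)
  object.respond_to?(:seek)
end

puts stream_like?(StringIO.new('data'))          # true
puts stream_like?(ForwardOnlyReader.new('data')) # false
```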
Zip::InputStream does not respond to seek, so you cannot use this object directly. To get around this, we simply need an object that does respond to seek (like a StringIO).
We can just read the input stream into a StringIO directly.
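A minimal sketch of that step. To keep it runnable without any gems, a plain StringIO stands in for entry.get_input_stream here, and the sample bytes are invented:

```ruby
require 'stringio'

# Stand-in for entry.get_input_stream. Real .xlsx data is binary, so the
# sample is a binary string as well.
source = StringIO.new("PK\x03\x04 pretend xlsx bytes".b)

# Read everything into memory and wrap it in a StringIO, which supports
# #seek and #rewind.
stream = StringIO.new(source.read)

puts stream.respond_to?(:seek) # true
stream.seek(2)
puts stream.read(2).bytes.inspect # [3, 4]
```

Note that this keeps the whole entry in memory, so it avoids a temporary file at the cost of RAM.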
Zip::IOExtras also provides a method to copy a Zip::InputStream to another IO object, which I think reads fairly nicely as well.
Knowing all of the above, we can put the pieces together: open the zip file, copy each entry's stream into a StringIO, and pass that to Roo::Spreadsheet.open.
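A minimal sketch of how these pieces might fit together, assuming the rubyzip and roo gems are installed. The helper names to_seekable and each_xlsx_in_zip are made up for illustration, and passing extension: :xlsx is an assumption on my part: a StringIO has no file name for Roo to inspect, so the format is given explicitly.

```ruby
require 'stringio'

# Wrap a forward-only stream in a StringIO so that it responds to #seek.
# Alternative via the Zip library (check that your rubyzip version ships it;
# the argument order is output first, input second):
#   buffer = StringIO.new
#   Zip::IOExtras.copy_stream(buffer, stream)
#   buffer.rewind
def to_seekable(stream)
  StringIO.new(stream.read)
end

# Yields the entry name and a Roo spreadsheet for every .xlsx entry.
# rubyzip and roo are required lazily so to_seekable stays usable on its own.
def each_xlsx_in_zip(zip_path)
  require 'zip'
  require 'roo'

  Zip::File.open(zip_path) do |zip_file|
    zip_file.each do |entry|
      # Skip directories and anything that is not an .xlsx file.
      next unless entry.file? && File.extname(entry.name).casecmp?('.xlsx')

      buffer = to_seekable(entry.get_input_stream)
      # A StringIO carries no file name, so tell Roo which format to expect.
      yield entry.name, Roo::Spreadsheet.open(buffer, extension: :xlsx)
    end
  end
end
```

Calling each_xlsx_in_zip(file_path) { |name, sheet| puts "#{name}: #{sheet.sheets.inspect}" } would then list the sheet names of each workbook without extracting the archive yourself.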