I'm having trouble downloading bulk data from Eurostat and hope you can help me out. I based my code on this post.
library(devtools)
# Install rsdmx from GitHub (repository "rsdmx" under the "opensdmx" account)
install_github("opensdmx/rsdmx")
library(rsdmx)
# Make a temporary file (tf) and a temporary folder (tdir)
tf <- tempfile(tmpdir = tdir <- tempdir())
## Download the zip file
download.file("http://ec.europa.eu/eurostat/estat-navtree-portlet-prod/BulkDownloadListing?sort=1&file=data%2Frd_e_gerdsc.sdmx.zip", tf)
## Unzip it in the temp folder
test <- unzip(tf, exdir = tdir)
sdmx <- readSDMX(test)
stats <- as.data.frame(sdmx)
head(stats)
I'm receiving this warning, and the dataframe is empty:
Warning message:
In if (attr(regexpr("<!DOCTYPE html>", content), "match.length") == :
the condition has length > 1 and only the first element will be used
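In Eurostat, the result of a bulk extraction is made of two separate XML files: the DSD (data structure definition), which describes the SDMX dataset, and the dataset itself. In your code, unzip returns the paths of both files and readSDMX receives both at once, which is what triggers the "condition has length > 1" warning. Based on your code, read only the dataset file.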
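A minimal sketch of that fix, under some assumptions: the DSD file inside the archive is taken to have a name ending in ".dsd.xml", and readSDMX is taken to accept isURL = FALSE for a local file. The helper names pick_data_file and read_eurostat_bulk are hypothetical, introduced here only for illustration.

```r
# Hypothetical helper: keep the dataset file and skip the DSD.
# Assumption: the DSD file name ends in ".dsd.xml".
pick_data_file <- function(paths) {
  data_paths <- paths[!grepl("\\.dsd\\.xml$", paths, ignore.case = TRUE)]
  # Fail early if the archive does not contain exactly one dataset file
  stopifnot(length(data_paths) == 1)
  data_paths
}

# Hypothetical wrapper: download, unzip, and read one Eurostat bulk file.
read_eurostat_bulk <- function(url) {
  tdir <- tempdir()
  tf <- tempfile(tmpdir = tdir, fileext = ".zip")
  # mode = "wb" keeps the zip archive intact on Windows
  download.file(url, tf, mode = "wb")
  files <- unzip(tf, exdir = tdir)
  # Pass a single local path; isURL = FALSE marks it as a file, not a URL
  sdmx <- rsdmx::readSDMX(pick_data_file(files), isURL = FALSE)
  as.data.frame(sdmx)
}
```

Calling stats <- read_eurostat_bulk(url) with the bulk download URL from the question, followed by head(stats), should then show the data rather than an empty data frame, provided the assumptions above hold for that archive.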
Note: calling as.data.frame might take some time to complete, depending on the size of the dataset. I have been performing more tests in order to further improve the performance of reading large SDMX datasets.
Your use case is very interesting. I will add it to the rsdmx documentation, as it shows how to use both the Eurostat bulk download service and rsdmx.
Hope this helps!