I'm trying to download the World Bank's WITS data and convert it into an R data frame. The site's API (ref. pp. 12-13) does not seem to allow users to request all "reporters," "partners," and "products" at once, so one may need to loop through a list of countries (either as "reporters" or "partners") using their XML request format:
http://wits.worldbank.org/API/V1/SDMX/V21/datasource/tradestats-tariff/reporter/usa/year/2000/partner/all/product/all/indicator/AHS-WGHTD-AVRG
(One will need to substitute each entry in a list of "reporters" (country abbreviations) for the "usa" part.) My goal is to loop through a list of country abbreviations, generate a data frame for each run, and then bind them together into a larger data frame. I referenced this post and used the following code, but it didn't get me any further. My code is posted below; I would really appreciate it if someone could take a look and share some tips.
# load required packages
library(RCurl)
library(XML)
devtools::install_github("opensdmx/rsdmx")
library(rsdmx)
# if just for one reporter (usa)
myUrl <- "http://wits.worldbank.org/API/V1/SDMX/V21/datasource/tradestats-tariff/reporter/usa/year/2000/partner/all/product/all/indicator/AHS-WGHTD-AVRG"
dataset <- readSDMX(myUrl)
stats <- as.data.frame(dataset)
dim(stats)
[1] 5142 8
# looks like this
head(stats)
FREQ REPORTER PARTNER PRODUCTCODE
1 A USA ABW 01-05_Animal
2 A USA ABW 06-15_Vegetable
3 A USA ABW 16-24_FoodProd
## loop through a list of reporters (countries)
library(rvest)
# teams
reporters <- c("aus", "usa", "ukr")
# init
df <- data.frame()
# loop
for(i in reporters){
# find url
myUrl <- paste0("http://wits.worldbank.org/API/V1/SDMX/V21/datasource/tradestats-tariff/reporter/", i,"/year/2000/partner/all/product/all/indicator/AHS-WGHTD-AVRG")
dataset <- readSDMX(myUrl)
stats <- as.data.frame(dataset)
# bind to dataframe
df <- rbind(df, stats)
}
# view captured data
View(df)
# NOTHING!
Consider R's built-in utils::download.file() and then parse with XML. Because your data is attribute-centric with no element text in <Series> and <Obs> nodes, consider the undocumented xmlAttrsToDataFrame, which requires the triple colon qualifier, :::.
Finally, use an apply function like sapply to avoid the bookkeeping of for loops and the inefficiency of growing an object in a loop by calling rbind iteratively on the same data frame. Wrapping the download and XML parsing in tryCatch also guards against potential errors for reporters like ukr.
The results are df_list, a list with one data frame per reporter, and final_df, the combined data frame.
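A minimal sketch of that approach follows; it is one way to put the pieces together, not a definitive implementation. The helper names attrs_df, parse_wits and fetch_reporter are hypothetical. The namespace-agnostic XPath and the one-<Obs>-per-<Series> layout are assumptions about the WITS response. XML:::xmlAttrsToDataFrame is an internal, unexported function, so its behaviour may change between package versions.

```r
library(XML)

# Hypothetical helper: collect the attributes of every <tag> node into a data frame.
# local-name() sidesteps the SDMX namespaces, whose exact prefixes are assumed unknown here.
attrs_df <- function(doc, tag) {
  nodes <- getNodeSet(doc, sprintf("//*[local-name()='%s']", tag))
  XML:::xmlAttrsToDataFrame(nodes)  # internal function, hence the triple colon
}

# Hypothetical helper: parse one downloaded file.
# Assumes exactly one <Obs> per <Series> (a single year is requested),
# so the two attribute tables line up row by row.
parse_wits <- function(file) {
  doc <- xmlParse(file)
  cbind(attrs_df(doc, "Series"), attrs_df(doc, "Obs"))
}

# Hypothetical helper: download and parse one reporter; return NULL on any error.
fetch_reporter <- function(i) {
  url <- paste0(
    "http://wits.worldbank.org/API/V1/SDMX/V21/datasource/tradestats-tariff/reporter/",
    i, "/year/2000/partner/all/product/all/indicator/AHS-WGHTD-AVRG"
  )
  tryCatch({
    tmp <- tempfile(fileext = ".xml")
    download.file(url, tmp, quiet = TRUE, mode = "wb")
    parse_wits(tmp)
  }, error = function(e) NULL)
}

reporters <- c("aus", "usa", "ukr")

# Named list with one element per reporter (NULL where the request or parse failed)
df_list <- sapply(reporters, fetch_reporter, simplify = FALSE)

# Drop the failures and bind everything once, instead of growing a data frame in a loop
final_df <- do.call(rbind, unname(Filter(Negate(is.null), df_list)))
```

Because failed reporters come back as NULL, names(Filter(is.null, df_list)) shows which requests need a second look.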