I'm trying to download some data and have the following code:
library(RJSONIO)
first.day <- as.Date(paste0("202201","01"), "%Y%m%d")
today <- Sys.Date()
months <- format(seq(first.day, today,by="month"), "%Y%m")
length(months)
months.df.list <- vector("list", length = length(months))
for (m in months) {
  print(m)
  filename <- paste0("./monthly_bike_data/", m, ".csv")
  query <- paste0("https://data.urbansharing.com/bergenbysykkel.no/trips/v1/", substr(m, 1, 4), "/", substr(m, 5, 6), ".json")
  month.df.raw <- fromJSON(query) # retrieve data
  month.df <- do.call(rbind, month.df.raw) # convert to a proper dataframe
  write.csv(month.df, filename)
}
However, when I run class(month.df) I get
[1] "list"
When I simply run the code
twentytwenty.raw <- fromJSON("https://data.urbansharing.com/bergenbysykkel.no/trips/v1/2020/01.json")
twentytwenty <- do.call(rbind, twentytwenty.raw)
class(twentytwenty)
outside of the loop, I get
[1] "matrix" "array"
As far as I can see, the only difference here is that do.call() is run inside vs. outside of a loop, and that seems to change the output.
Does anyone know why this is the case, and can you suggest a fix?
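As a sanity check, here is a minimal sketch with made-up records (the field names started_at and duration are placeholders, not necessarily what the API returns). It suggests that do.call(rbind, ...) gives the same result inside and outside a loop, and that the result is a matrix rather than a data frame either way, so any difference would have to come from the input:

```r
# Two made-up records, shaped like named character vectors.
records <- list(
  c(started_at = "2020-01-01 06:00", duration = "300"),
  c(started_at = "2020-01-01 07:00", duration = "420")
)

# Outside a loop: rbind stacks the vectors into a character matrix.
m <- do.call(rbind, records)

# Inside a loop: same call, same input, same result.
for (i in 1) {
  m_loop <- do.call(rbind, records)
}

print(identical(m, m_loop))
print(is.matrix(m))
print(is.data.frame(m))

# An explicit conversion is needed to get an actual data frame.
df <- as.data.frame(m, stringsAsFactors = FALSE)
df$duration <- as.numeric(df$duration)
str(df)
```

Checking is.data.frame() right after the do.call() step, and inspecting str(month.df.raw) for the month in question, may help narrow down where the shapes diverge.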
Assuming it's more about getting those files than about loops and do.call(), I'd suggest a slightly different approach. First, a few remarks:
So I'd build a data frame from the S3 ListBucketResult XML and use those URLs to fetch the CSV files directly. Sorry for the tidyverse overload.
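A minimal sketch of that idea, using xml2 and base R rather than the tidyverse. The XML listing, the keys in it, and base_url are made-up placeholders for illustration; a real listing would be read from the bucket's own URL, and its layout may differ:

```r
library(xml2)

# Hypothetical ListBucketResult document; in practice this would come from
# read_xml("<bucket listing URL>") instead of an inline string.
listing_xml <- '<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Name>example-bucket</Name>
  <Contents><Key>trips/v1/2022/01.csv</Key><Size>1024</Size></Contents>
  <Contents><Key>trips/v1/2022/01.json</Key><Size>4096</Size></Contents>
  <Contents><Key>trips/v1/2022/02.csv</Key><Size>2048</Size></Contents>
</ListBucketResult>'

doc <- read_xml(listing_xml)
xml_ns_strip(doc)  # drop the default namespace so plain XPath works

# One <Contents> node per object in the bucket.
contents <- xml_find_all(doc, "//Contents")
files <- data.frame(
  key  = xml_text(xml_find_first(contents, "./Key")),
  size = as.numeric(xml_text(xml_find_first(contents, "./Size"))),
  stringsAsFactors = FALSE
)

# Keep only the CSV objects and build a download URL for each.
base_url <- "https://example.com/"  # hypothetical base URL
csv_files <- files[grepl("\\.csv$", files$key), ]
csv_files$url <- paste0(base_url, csv_files$key)

print(csv_files)

# With real URLs, each file could then be read directly, e.g.:
# trips <- lapply(csv_files$url, read.csv)
```

Reading the CSV files directly gives data frames from the start, so the rbind-over-JSON step is no longer needed.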
Created on 2023-07-01 with reprex v2.0.2