Convert object from JSON to dataframe quickly

83 Views Asked by At

I have an object which is available through a URL in JSON format, and part of it I want to put into a dataframe format to analyse the data in R. I currently do this as follows:

# Read data in using fromJSON function
data <- RJSONIO::fromJSON('https://api-prod.footballindex.co.uk/football.allTradable24hrchanges?page=1&per_page=5000&sort=asc')

# The above is an example dataset to use for this question
# In my actual dataset there are fields which only exist in some elements in list so adding this onto example
# I want to handle these by returning in the end dataframe with NA if it doesn't exist
data[['items']][[1]]$newField <- 1

# It is only the data in the items field I am interested in
# Unlist each element to get all nested emelents within the lists in flat format
dataList <- lapply(data[['items']], unlist)

# Combine all elemnts of list together
dataDF <- dplyr::bind_rows(dataList)

# Convert into data.frame
dataDF <- data.frame(dataDF)

This works, however the bind_rows part takes a long time

> system.time(dataDF <- dplyr::bind_rows(dataList))
   user  system elapsed 
 42.195   0.000  42.216 

It feels like there must be a quicker way of doing this.

I am told that data.table::rbindlist is a faster alternative but using this gives me error message

> dataDF <- data.table::rbindlist(dataList)
Error in data.table::rbindlist(dataList) : 
  Item 1 of input is not a data.frame, data.table or list

Had a suggestion of using do.call(rbind... in an answer which runs and is fast, but when there are fields which are only in some elements it does not handle this properly. For example

dataDF2 <- data.frame(do.call(rbind, dataList))
> head(dataDF$country)
[1] "Côte d'Ivoire" "Italy"         "England"       "Scotland"      "Germany"       "France"       
> head(dataDF2$country)
[1] "Côte d'Ivoire" "1.65"          "1.62"          "FALSE"         "2.59"          "France"    

Thanks in advance for any help

1

There are 1 best solutions below

2
On
data <- RJSONIO::fromJSON('https://api-prod.footballindex.co.uk/football.allTradable24hrchanges?page=1&per_page=5000&sort=asc')

system.time(dataDF <- as.data.frame(do.call(rbind, data[['items']])))

   user  system elapsed 
  0.007   0.000   0.006