How can I identify where or why a JSON is invalid (RJSONIO)

I'm dealing with a data column whose values are massive JSON strings. Each row value is ~50,000 characters.

After spending some time trying to fiddle with fromJSON to go from JSON -> dataframe where columns = JSON keys, and getting numerous errors in doing so, I used isValidJSON() across the column and found that about 75% of my JSON is "invalid".

Now, I'm fully confident based on the source that this data is in fact valid JSON straight from the DB, so I would love to be able to identify where in the 50,000 characters the fromJSON function is running into trouble.

I've tried debug() but it just tells me at which function call the error occurs.

I'd share sample rows if they weren't all so cumbersome, but it's a healthy mix of values, imagine a df with df$features:

{"names":["bob","alice"],"ages":{"bob":20,"alice":21}, "id":54, "isTrue":false}... ad infinitum 

Code I'm trying to run:

library(RJSONIO)
# I() marks x as literal JSON text rather than a file name
iValid <- function(x){return(isValidJSON(I(x)))}
sapply(df$features,iValid)

 [1]  TRUE FALSE FALSE FALSE  TRUE  TRUE FALSE  TRUE  TRUE FALSE...
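
Before blaming the parser, it may help to compare cheap properties of the rows that pass against those that fail. The following is only a sketch in base R: `describe_json_column` is a hypothetical helper name, and the checks (missing values, length, UTF-8 validity, first and last character, raw control characters) are guesses at common causes, not a diagnosis. Nothing in it parses JSON.

```r
# Hypothetical helper: cheap per-row checks on a column of JSON strings.
# It does not parse anything; it only flags properties that often
# accompany "invalid JSON" errors (truncation, bad encoding, raw control bytes).
describe_json_column <- function(x) {
  x <- as.character(x)  # a factor column is converted to its labels first
  data.frame(
    is_na       = is.na(x),
    # NA when the string is missing or cannot be counted in this encoding
    n_chars     = nchar(x, type = "chars", allowNA = TRUE),
    valid_utf8  = validUTF8(x),
    # first non-blank character is "{" or "["
    starts_ok   = grepl("^[[:space:]]*[{[]", x, useBytes = TRUE),
    # last non-blank character is "}" or "]"; FALSE hints at a truncated value
    ends_ok     = grepl("[]}][[:space:]]*$", x, useBytes = TRUE),
    # raw control bytes (tab, newline, ...) are legal between tokens
    # but not inside a JSON string
    has_control = grepl("[\\x01-\\x1f]", x, perl = TRUE, useBytes = TRUE),
    stringsAsFactors = FALSE
  )
}

# Usage, assuming the data frame from the question:
# checks <- describe_json_column(df$features)
# checks$valid <- sapply(df$features, iValid)
# aggregate(. ~ valid, data = checks, FUN = mean)
```

If, say, `ends_ok` is FALSE for exactly the rows that fail, the values were probably truncated somewhere between the DB and R rather than rejected for their content.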

> fromJSON(df$features[2])
debugging in: fromJSON(df$features[2])
debug: standardGeneric("fromJSON")
Browse[2]> n
debugging in: fromJSON(content, handler, default.size, depth, allowComments, 
    asText = FALSE, data, maxChar, simplify = simplify, ..., 
    nullValue = nullValue, simplifyWithNames = simplifyWithNames, 
    encoding = encoding, stringFun = stringFun)
debug: standardGeneric("fromJSON")
Browse[3]> n
Error in fromJSON(content, handler, default.size, depth, allowComments,  : 
  invalid JSON input
> 
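
Since the RJSONIO error carries no position, one option is to run the failing rows through a second parser that reports more detail. The sketch below assumes the jsonlite package is installed; it is a different library from RJSONIO, so the two may disagree on edge cases. To my knowledge `jsonlite::validate()` returns a logical with an `"err"` attribute holding the parser's message, which normally points at the offending character. `json_error` is a hypothetical helper name.

```r
# Assumption: jsonlite is installed. This is a swapped-in parser, not RJSONIO,
# used only because its validator returns an error message.
json_error <- function(x) {
  res <- jsonlite::validate(as.character(x))
  # validate() returns TRUE/FALSE; on failure the message is in attr "err"
  if (isTRUE(as.logical(res))) NA_character_ else attr(res, "err")
}

# Usage, assuming the data frame from the question:
# errs <- vapply(as.character(df$features), json_error, character(1),
#                USE.NAMES = FALSE)
# table(is.na(errs))         # how many rows jsonlite accepts
# cat(errs[!is.na(errs)][1]) # first error message, with its location marker
```

If jsonlite accepts rows that `isValidJSON()` rejects, the difference is more likely in how the text is handed to RJSONIO than in the JSON itself.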