I'm trying to extract an array of industry code descriptions from the OpenCorporates.com JSON output using OpenRefine. I've extracted the industry_codes
array from the JSON body into a new column. Some records have a full array, some just have [ ]
. Below is an example of a populated array.
I want to extract an array of the industry descriptions, so the output I'm looking for from the example below is ['Other accommodation','Other food service activities','Other accommodation']
. I'm trying to do that by using a forEach
control, as follows:
grel:forEach(value.parseJson(),v,v.industry_code.description)
This provides the right output at preview.
but when I run the script into OpenRefine I get empty cells. If I ask OpenRefine to store the error I get the following red error message:
Object[] value not storable
.
I've tried faceting out the empty arrays but that didn't work, the JSON is validating correctly, so I'm not sure what is going wrong.
[
{
"industry_code": {
"code": "55.90",
"description": "Other accommodation",
"code_scheme_id": "uk_sic_2007",
"code_scheme_name": "UK SIC Classification 2007",
"uid": "uk_sic_2007-5590"
}
},
{
"industry_code": {
"code": "56.29",
"description": "Other food service activities",
"code_scheme_id": "uk_sic_2007",
"code_scheme_name": "UK SIC Classification 2007",
"uid": "uk_sic_2007-5629"
}
},
{
"industry_code": {
"code": "55.90",
"description": "Other accommodation",
"code_scheme_id": "eu_nace_2",
"code_scheme_name": "European Community NACE Rev 2",
"uid": "eu_nace_2-5590"
}
}
]
It's not the best error message, but what it's trying to tell you is that arrays can't be stored, whether they contain values or are empty.
One easy solution would be to convert array of nested dictionaries back to a JSON string before storing it.