I'm working on flattening a JSON string in BigQuery and encountered challenges when replacing None with null and removing single quotes from a free text attribute.
Here's the JSON string:
[{'a': '123', 'b': None, 'freetext': "I'm data" }]
To address these issues, I've crafted the following query:
WITH cte AS (
SELECT "[{'a': '123', 'b': None, 'freetext': "I'm data" }]" AS a
)
SELECT
JSON_EXTRACT_SCALAR(value, '$.a') AS a,
JSON_EXTRACT_SCALAR(value, '$.b') AS b,
JSON_EXTRACT_SCALAR(value, '$.freetext') AS freetext
FROM cte,
UNNEST(
JSON_EXTRACT_ARRAY(
REPLACE(
REPLACE(REPLACE(REPLACE(a, "None", "null"), "'", "!~!"), '"', "'"),
"!~!", ""
)
)
) AS value;
I initially attempted to use REGEXP_REPLACE to remove single quotes, but encountered challenges with the syntax in BigQuery.
try 1 : REGEXP_REPLACE(your_column, r'"([^"]*)''([^"]*)"', r'"\1\2"')
try 2 : REGEXP_REPLACE(your_column, r'"([^"]*)\'([^"]*)"', r'"\1\2"')
Any suggestions on how to further simplify or improve this query are appreciated!