Flatten JSON String in BigQuery and Handle Null and Single Quotes

40 Views Asked by At

I'm working on flattening a JSON string in BigQuery and encountered challenges when replacing None with null and removing single quotes from a free text attribute.

Here's the JSON string:

[{'a': '123', 'b': None,  'freetext': "I'm data" }]

To address these issues, I've crafted the following query:

WITH cte AS (
  SELECT "[{'a': '123', 'b': None, 'freetext': "I'm data" }]" AS a 
)
SELECT
  JSON_EXTRACT_SCALAR(value, '$.a') AS a,
  JSON_EXTRACT_SCALAR(value, '$.b') AS b,
  JSON_EXTRACT_SCALAR(value, '$.freetext') AS freetext
FROM cte,
  UNNEST(
    JSON_EXTRACT_ARRAY(
      REPLACE(
        REPLACE(REPLACE(REPLACE(a, "None", "null"), "'", "!~!"), '"', "'"),
        "!~!", ""
      )
    )
  ) AS value;

I initially attempted to use REGEXP_REPLACE to remove single quotes, but encountered challenges with the syntax in BigQuery.

  try 1 : REGEXP_REPLACE(your_column, r'"([^"]*)''([^"]*)"', r'"\1\2"')  
  try 2 : REGEXP_REPLACE(your_column, r'"([^"]*)\'([^"]*)"', r'"\1\2"')  

Any suggestions on how to further simplify or improve this query are appreciated!

0

There are 0 best solutions below