Airflow - BigQuery Invalid value at schema fields


I have a GCP Cloud Composer environment that loads data from GCS into BigQuery. I'm using the schema_fields option to pass the source schema, which I keep in a variable populated from an XCom pull. See the output of print(schema) below.

[{"mode": "NULLABLE", "name": "id", "type": "INTEGER"}, {"mode": "NULLABLE", "name": "c1", "type": "DATE"}, {"mode": "NULLABLE", "name": "c2", "type": "TIME"}, {"mode": "NULLABLE", "name": "c3", "type": "DATETIME"}, {"mode": "NULLABLE", "name": "c4", "type": "TIMESTAMP"}]

In the BigQuery operator I pass schema_fields=schema.

But when I run the DAG, it throws this error:

ERROR - <HttpError 400 when requesting https://bigquery.googleapis.com/bigquery/v2/projects/xxx-xxx/jobs?
alt=json returned "Invalid value at 'job.configuration.load.schema.fields' (type.googleapis.com/google.cloud.bigquery.v2.TableFieldSchema), 
"[{"mode": "NULLABLE", "name": "id", "type": "INTEGER"}, {"mode": "NULLABLE", "name": "c1", "type": "DATE"}, {"mode": "NULLABLE", "name": "c2", "type": "TIME"}, {"mode": "NULLABLE", "name": "c3", "type": "DATETIME"}, {"mode": "NULLABLE", "name": "c4", "type": "TIMESTAMP"}]"">

When I saved this schema as a file in GCS and used schema_object instead, it worked. Passing the same schema through the variable did not.


There are 2 best solutions below

Best answer:

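I was able to solve this by using json.loads. The value pulled from XCom is a JSON string rather than a list (note the quotes around the schema in the error message), so it has to be parsed before it is passed to schema_fields.

import json
schema = json.loads(...)  # pass in the string returned by your xcom_pull call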
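To illustrate why the parse step matters, here is a minimal sketch that runs without Airflow. The variable `pulled` is a hypothetical stand-in for whatever the xcom_pull call returns, shortened to two of the fields from the question; the assumption is that the pulled value is a JSON-encoded string, as the quoted schema in the error message suggests.

```python
import json

# Hypothetical stand-in for the XCom value: a JSON string, not a Python list.
pulled = (
    '[{"mode": "NULLABLE", "name": "id", "type": "INTEGER"}, '
    '{"mode": "NULLABLE", "name": "c1", "type": "DATE"}]'
)

# Before parsing, the value is one long string. Sending it as
# schema_fields is what BigQuery rejects as an invalid value.
print(type(pulled).__name__)  # str

# json.loads turns the string into a list of dicts, one per column.
schema = json.loads(pulled)
print(type(schema).__name__)  # list
print(schema[0]["name"])      # id
```

The parsed `schema` is the shape that schema_fields expects: a list with one dict per column.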

Second answer:

The schema_fields parameter expects an actual Python list of field dicts, not a string. It's not a templated field as of now.

I used the sample fields below and they work. Please try the same format and see if it works for you.

schema_fields=[
    {'name': 'Col1', 'type': 'STRING', 'mode': 'NULLABLE'},
    {'name': 'Col2', 'type': 'STRING', 'mode': 'NULLABLE'},
]
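If the schema can arrive either as a list or as a JSON string, a small guard in front of the operator may help catch the problem early. The helper below is only a sketch: `to_schema_fields` is a hypothetical name, not part of Airflow or the BigQuery client, and the check that every field has 'name' and 'type' keys is an assumption based on the sample format above.

```python
import json


def to_schema_fields(value):
    """Return a list of field dicts from a JSON string or an existing list.

    Hypothetical helper, not an Airflow API.
    """
    # Parse strings; leave anything else untouched for the checks below.
    fields = json.loads(value) if isinstance(value, str) else value

    if not isinstance(fields, list):
        raise TypeError("schema must be a list of field dicts")

    for field in fields:
        # Each column is expected to look like the sample entries above.
        if not isinstance(field, dict) or "name" not in field or "type" not in field:
            raise ValueError(f"invalid schema field: {field!r}")

    return fields


as_list = [{"name": "Col1", "type": "STRING", "mode": "NULLABLE"}]
as_text = json.dumps(as_list)

# Both inputs end up as the same list of dicts.
print(to_schema_fields(as_list) == to_schema_fields(as_text))  # True
```

The result can then be passed as schema_fields in the same way as the literal list shown above.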