Missing hive partition key column while creating hive partition external table using bq command

1.9k Views Asked by At

I am using the bq version 2.0.58 and trying to create hive partitioning BigQuery external table. But it's missing hive partition key column while creating hive partition external table using bq command, the following command to create a table.

Storage:

gs://<bucket>/myfolder/size=1000/file_1k.csv
gs://<bucket>/myfolder/size=10000/file_10k.csv

The mkdef command,

bq mkdef --project_id=my-project --autodetect --source_format=CSV --hive_partitioning_mode=AUTO --hive_partitioning_source_uri_prefix=gs://<bucket>/myfolder gs://<bucket>/myfolder/* >my_table_def.json

The mk command, bq mk --external_table_definition=my_table_def.json my-project:my-dataset.mytable

The above command created a table with a hive partition key column. But When I add a schema into it. It's missing to add the partition key (size) column while creating the table.

The following bq command missing to create partition key column while creating the table.

bq mk --schema=ip:STRING,fraudype:STRING,probability:FLOAT --external_table_definition=my_table_def.json my-project:my-dataset.mytable

my_table_def.json,

{
   "csvOptions": {
     "encoding": "UTF-8",
     "quote": "\""
   },
   "hivePartitioningOptions": {
     "mode": "AUTO",
     "sourceUriPrefix": "gs://<bucket>/myfolder"
   },
   "sourceFormat": "CSV",
   "sourceUris": [
     "gs://<bucket>/myfolder/*"
   ]
 }
1

There are 1 best solutions below

0
On

The --schema command should not be used with bq mk for hive partitioned external table definitions, mkdef will help.

Solution:

bq mkdef --source_format=CSV \
--hive_partitioning_mode=CUSTOM \
--hive_partitioning_source_uri_prefix=gs://<bucket>/myfolder/{size:INTEGER} \
gs://<bucket>/myfolder/* \
ip:STRING,fraudype:STRING,probability:FLOAT >my_table_def.json

bq mk --external_table_definition=my_table_def.json my-project:my-dataset.mytable