Connecting BigQuery to Dataproc Metastore/Hive tables
371 Views, asked by nir

Is it possible to connect BigQuery to a Hive/Dataproc Metastore database? I don't want to load Hive tables (ORC or Parquet) into BigQuery's internal storage. If BigQuery can route its SQL to Hive, and Hive then runs the query on Spark, that works for me. I considered using the Hive CLI instead of BigQuery to execute queries, but doing it through BigQuery would give a single, unified interface for ad-hoc SQL. I also considered external tables in BigQuery, which can point directly at the raw Parquet/ORC locations. However, the ORC tables are ACID tables managed by Hive, so having BigQuery read the raw ORC files directly may result in inconsistent reads.
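For context, the external-table route mentioned above looks roughly like this with the google-cloud-bigquery Python client; the project, dataset, bucket, and paths are placeholders, and this sketch does not address the ACID/consistency concern for Hive-managed ORC tables:

```python
from google.cloud import bigquery

# Hypothetical project ID.
client = bigquery.Client(project="my-project")

# Define an external table over raw Parquet files in GCS (placeholder paths).
external_config = bigquery.ExternalConfig("PARQUET")
external_config.source_uris = ["gs://my-bucket/warehouse/orders/*.parquet"]

table = bigquery.Table("my-project.my_dataset.orders_ext")
table.external_data_configuration = external_config
client.create_table(table)

# BigQuery now reads the files in place at query time.
for row in client.query("SELECT COUNT(*) AS n FROM my_dataset.orders_ext").result():
    print(row.n)
```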
There are 2 best solutions below

It is possible to connect Hive/Dataproc to BigQuery (or vice versa) by using the Spark BigQuery connector. Note that Spark SQL's catalog integration supports Hive but not BigQuery, even though Spark can read from and write to BigQuery through the spark-bigquery-connector.
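A minimal sketch of that setup, assuming the spark-bigquery-with-dependencies package is on the classpath and that the database, table, dataset, and bucket names are placeholders: Spark SQL resolves the Hive table, and the connector writes the result to BigQuery.

```python
from pyspark.sql import SparkSession

# Spark session with Hive support; the connector jar must be available, e.g. via
# --packages com.google.cloud.spark:spark-bigquery-with-dependencies_2.12:<version>.
spark = (
    SparkSession.builder
    .appName("hive-to-bigquery")
    .enableHiveSupport()
    .getOrCreate()
)

# Read a Hive-managed table through Spark SQL (placeholder database/table names).
df = spark.sql("SELECT * FROM sales_db.orders WHERE order_date >= '2023-01-01'")

# Write the result to BigQuery via the connector; temporaryGcsBucket is required
# for the indirect (GCS-staged) write method.
(
    df.write.format("bigquery")
    .option("table", "my_dataset.orders_snapshot")
    .option("temporaryGcsBucket", "my-temp-bucket")
    .mode("overwrite")
    .save()
)
```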

I was able to achieve this by using the BigLake Metastore catalog. I stumbled upon this document while looking to expose Apache Iceberg external tables to BigQuery, and it seems you can use the same catalog to expose Hive (or Dataproc Metastore) tables to BigQuery as well.
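For completeness, a rough sketch of the Spark-side catalog configuration. The BigLakeCatalog class name and the gcp_project / gcp_location / blms_catalog / warehouse properties reflect my reading of the BigLake Metastore documentation, so treat them as assumptions and check the current docs; all project, bucket, and catalog names are placeholders, and the Iceberg runtime and BigLake catalog jars are assumed to be on the classpath.

```python
from pyspark.sql import SparkSession

# Sketch: register an Iceberg catalog backed by BigLake Metastore.
# Property names follow the BigLake Metastore docs (assumption); values are placeholders.
spark = (
    SparkSession.builder
    .appName("biglake-metastore-catalog")
    .config("spark.sql.catalog.blms", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.blms.catalog-impl",
            "org.apache.iceberg.gcp.biglake.BigLakeCatalog")
    .config("spark.sql.catalog.blms.gcp_project", "my-project")
    .config("spark.sql.catalog.blms.gcp_location", "us")
    .config("spark.sql.catalog.blms.blms_catalog", "my_catalog")
    .config("spark.sql.catalog.blms.warehouse", "gs://my-bucket/warehouse")
    .getOrCreate()
)

# Tables created through this catalog are registered in BigLake Metastore and
# can then be surfaced in BigQuery as BigLake/Iceberg external tables.
spark.sql("CREATE NAMESPACE IF NOT EXISTS blms.sales_db")
spark.sql(
    "CREATE TABLE IF NOT EXISTS blms.sales_db.orders "
    "(order_id BIGINT, order_date DATE) USING iceberg"
)
```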