Planning to build a data platform using Google Cloud Dataproc for compute, storing the data in Delta tables (Delta Lake).
Currently exploring the data catalog options in the GCP stack, along with the open-source Hive metastore, and would like to clarify the questions below:
- What is the difference between Google Cloud Data Catalog and Dataproc Metastore (https://cloud.google.com/dataproc-metastore/docs)? Coming from the AWS world, what is the GCP equivalent of the AWS Glue Data Catalog?
- If we migrate the application from GCP to another Spark platform (for example, Databricks), can we port/reuse the Data Catalog entries or Dataproc Metastore already created on GCP?
- Where is the Data Catalog / Dataproc Metastore metadata stored? Is it GCS or some other storage?
- As per the documentation (https://cloud.google.com/data-catalog/docs/concepts/overview), Data Catalog automatically catalogs data in GCS, BigQuery, and Pub/Sub. Does Data Catalog or Dataproc Metastore automatically capture metadata for Delta tables on the Google platform?
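
For context, this is roughly how we plan to stand up the cluster and register Delta tables, sketched under assumptions: the project, region, service and bucket names are placeholders, and the Delta package version is illustrative, not a tested value.

```shell
# Sketch: create a Dataproc cluster attached to a Dataproc Metastore service.
# All names below (project, region, cluster, metastore, bucket) are placeholders.
gcloud dataproc clusters create delta-cluster \
    --region=us-central1 \
    --dataproc-metastore=projects/my-project/locations/us-central1/services/my-metastore \
    --properties='spark:spark.jars.packages=io.delta:delta-core_2.12:2.1.0,spark:spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension,spark:spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog'

# Write the table data to GCS as Delta and register it in the attached metastore.
spark-sql -e "CREATE TABLE events (id BIGINT, ts TIMESTAMP) USING DELTA LOCATION 'gs://my-bucket/delta/events'"
```

The question above about automatic metadata capture is essentially whether a table registered this way surfaces in Data Catalog / Dataproc Metastore without the explicit `CREATE TABLE` step.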