How can we interact with Dataproc Metastore to fetch the list of databases and tables?


I am using Dataproc Metastore as the metastore service on GCP. How can I interact with it to fetch the list of databases and tables? Is it possible to do this without running a Dataproc cluster?

Edit: I have to fetch the metadata without running a Dataproc cluster. Since I am using the Dataproc Metastore service to store metadata, I need to fetch the metadata directly from it.


There is 1 best solution below


The Dataproc Metastore API is used to manage the Dataproc Metastore service instance itself (get, create, update, and so on). As mentioned in one of the comments, you can use the Thrift URI to reach the metadata (if you are using the console, you will find the URI under the Configuration tab of the metastore service).
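If you would rather look up the Thrift URI from code than from the console, something along these lines may work. This is only a sketch: it assumes the `google-cloud-metastore` Python package is installed and that application default credentials are set up, and the project, location and service IDs are placeholders rather than values from the question.

```python
def service_resource_name(project_id, location, service_id):
    """Build the full resource name the Dataproc Metastore API expects."""
    return f"projects/{project_id}/locations/{location}/services/{service_id}"


def fetch_endpoint_uri(resource_name):
    """Return the Thrift endpoint URI of a Dataproc Metastore service.

    Assumption: the third-party google-cloud-metastore package is installed.
    It is imported here so the helper above works without it.
    """
    from google.cloud import metastore_v1

    client = metastore_v1.DataprocMetastoreClient()
    service = client.get_service(name=resource_name)
    return service.endpoint_uri


if __name__ == "__main__":
    # Placeholder IDs, replace with your own.
    name = service_resource_name("my-project", "us-central1", "my-metastore")
    print(fetch_endpoint_uri(name))
```

This only reads the service configuration, so it does not need a Dataproc cluster.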

Once you have a Thrift client connected to the Thrift URI, you can fetch databases and tables. Although you can also use the Thrift API to create databases and tables, the typical use case is to configure a big data processing engine or framework such as Spark or Hive to use the metastore, rather than interacting with the metastore directly.
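As a rough illustration of the direct approach, here is a minimal Python sketch. It assumes the third-party `hmsclient` package as the Thrift client (any Hive Metastore Thrift client should do) and assumes the machine running it can reach the Thrift endpoint over the network. The URI in the example is a placeholder. `get_all_databases` and `get_all_tables` are calls from the standard Hive Metastore Thrift interface.

```python
from urllib.parse import urlparse


def parse_thrift_uri(uri):
    """Split a URI such as thrift://10.0.0.5:9083 into (host, port)."""
    parsed = urlparse(uri)
    if parsed.scheme != "thrift" or not parsed.hostname:
        raise ValueError(f"not a thrift URI: {uri!r}")
    # 9083 is the conventional Hive Metastore port, used if none is given.
    return parsed.hostname, parsed.port or 9083


def list_databases_and_tables(client):
    """Map each database name to its table names.

    `client` is any object exposing the Hive Metastore Thrift calls
    get_all_databases() and get_all_tables(db_name).
    """
    return {db: list(client.get_all_tables(db)) for db in client.get_all_databases()}


def fetch_catalog(thrift_uri):
    """Connect to the metastore and list its databases and tables.

    Assumption: the third-party hmsclient package is installed.
    """
    from hmsclient import hmsclient

    host, port = parse_thrift_uri(thrift_uri)
    with hmsclient.HMSClient(host=host, port=port) as client:
        return list_databases_and_tables(client)


if __name__ == "__main__":
    # Placeholder URI, use the one shown for your metastore service.
    for db, tables in fetch_catalog("thrift://10.0.0.5:9083").items():
        print(db, tables)
```

No Dataproc cluster is involved here. The script only needs network access to the Thrift endpoint.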