Databricks Unity Catalog: multiple metastores for the same region


We have three Databricks workspaces: one for dev, one for test, and one for production. All of them are in the same region, West Europe.

All of our data is in the data lake, meaning the external tables in Databricks reference data stored in the data lake (Azure Data Lake Storage Gen2).

Each of these workspaces therefore has a different data lake associated with it, since each serves a different environment.

This does not fit the usual Unity Catalog use case, where multiple workspaces refer to the same metastore. We would have different access requirements for each environment, as well as different data. In some cases, certain tables may exist in lower environments but not in production.

Also, looking here, I see the following sentence:

You can create one metastore per region and attach it to any number of workspaces in that region.

All our Databricks workspaces (for the different environments) are in the same region, but in different subscriptions.

Does that mean Unity Catalog does not apply to this use case? Using it would seem to require creating three different metastores in the same region.

If it does not apply, how can we get goodies like:

  1. Terraform capabilities that are available only with Unity Catalog, e.g. creating a schema.
  2. Data lineage

There is 1 best solution below


This is how Unity Catalog works (at least right now): each region can have only one Unity Catalog metastore, and all workspaces in that region can be attached to it.

For now, the environment separation problem can be solved with user groups, granting each group access only to the data of its environment. You can also configure the Azure Storage firewall so that each storage account accepts access only from the workspaces of the corresponding environment.
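
As a rough illustration of the group-based separation, here is a minimal Python sketch that generates Unity Catalog `GRANT` statements, assuming one catalog per environment in the shared metastore. The catalog names (`dev`, `test`, `prod`), the group names, and the choice of privileges are hypothetical and not taken from the question. The privilege names follow the current Unity Catalog privilege model (`USE CATALOG`, `USE SCHEMA`, `SELECT`, `MODIFY`), so check them against the version you are running.

```python
# Sketch only: catalog names, group names, and privilege sets below are
# hypothetical placeholders, not values from the original setup.
ENVIRONMENTS = {
    "dev": {
        "group": "data-engineers-dev",
        "privileges": ["USE CATALOG", "USE SCHEMA", "SELECT", "MODIFY"],
    },
    "test": {
        "group": "data-engineers-test",
        "privileges": ["USE CATALOG", "USE SCHEMA", "SELECT", "MODIFY"],
    },
    "prod": {
        "group": "data-readers-prod",
        "privileges": ["USE CATALOG", "USE SCHEMA", "SELECT"],
    },
}


def grant_statements(environments):
    """Build one GRANT statement per environment catalog.

    Each catalog is granted only to the group of its own environment,
    so members of a dev group get no privileges on the prod catalog.
    """
    statements = []
    for catalog, cfg in environments.items():
        privileges = ", ".join(cfg["privileges"])
        statements.append(
            f"GRANT {privileges} ON CATALOG `{catalog}` TO `{cfg['group']}`"
        )
    return statements


if __name__ == "__main__":
    # Print the statements; in a notebook they could be passed to spark.sql().
    for statement in grant_statements(ENVIRONMENTS):
        print(statement)
```

The statements are only built as strings here, so they can be reviewed before a metastore admin or catalog owner runs them, for example from a notebook or a SQL editor.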

Later this year there will be a feature that allows attaching specific catalogs only to specific workspaces, so you can clearly separate environments. It was mentioned in last quarter's product roadmap, and you can attend the upcoming product roadmap webinar to get more updates about Unity Catalog.