Azure Databricks Architecture - Communication between control plane and data plane, and authentication


I am trying to understand the Azure Databricks architecture based on this link. I understand the purpose of the control plane and the data plane in the Azure Databricks architecture, but I couldn't find answers to the following questions.

  • How do the control plane and the data plane communicate?

  • How do the control plane and the data plane authenticate to each other?


There is 1 best solution below


There are two ways the control plane and the data plane communicate:

  1. Legacy: the VMs running in the data plane must have public IPs, and the control plane reaches them directly. This approach has always been a security headache. Azure still supports it and shows it in the UI, but it shouldn't be used.
  2. "No Public IP (NPIP)", also known as "Secure Cluster Connectivity" (doc and more technical details). In this case, when the VMs in the data plane start, they open a bi-directional tunnel to a relay in the control plane, and that tunnel is always used for controlling the VMs and Spark. In this setup, the VMs don't need public IPs, so it is much more secure and easier to control.
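To make the second option more concrete, here is a toy Python sketch of the general idea of an outbound-initiated tunnel: the "worker" (standing in for a data plane VM) never listens on any port and only dials out to a "relay" (standing in for the control plane), which then sends commands back over that same connection. This is only an illustration of the pattern, not how Databricks actually implements its relay; the function names, the command strings, and the line-based protocol are all made up for the example.

```python
import socket
import threading


def run_relay(server_sock, commands, results):
    """Toy 'control plane' relay: wait for a worker to dial in,
    then push commands over the connection the worker opened."""
    conn, _ = server_sock.accept()
    with conn, conn.makefile("r") as reader:
        for cmd in commands:
            conn.sendall((cmd + "\n").encode())
            # Read the worker's one-line reply to this command.
            results.append(reader.readline().strip())
    # Leaving the 'with' block closes the connection, which ends the worker loop.


def run_worker(relay_addr):
    """Toy 'data plane' VM: opens an outbound connection only.
    It has no listening socket, so nothing can connect to it directly."""
    with socket.create_connection(relay_addr) as sock, sock.makefile("r") as reader:
        for line in reader:  # loop ends when the relay closes the tunnel
            sock.sendall(f"done: {line.strip()}\n".encode())


def demo():
    # The relay is the only side with a listening (reachable) address.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)

    results = []
    relay = threading.Thread(
        target=run_relay,
        args=(server, ["start-spark", "run-job"], results),  # hypothetical commands
    )
    worker = threading.Thread(target=run_worker, args=(server.getsockname(),))
    relay.start()
    worker.start()
    relay.join()
    worker.join()
    server.close()
    return results


if __name__ == "__main__":
    print(demo())
```

The point of the sketch is the direction of the connection: because the worker initiates it, only outbound traffic from the data plane has to be allowed, yet the relay can still drive the worker once the tunnel is up.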

Regarding authentication: it's an internal detail, but it provides a way of ensuring that the VMs communicating with the control plane really are the VMs that form the cluster.