What is the purpose of the "host" blob in azure-webjobs-hosts storage container with Azure Functions?


I have an Azure Function App with Python functions that use a Service Bus trigger. In the storage account associated with the Function App, there is a container named azure-webjobs-hosts that contains a folder named locks. Inside it are several folders named exactly like my Function App (or my App Service plan, I'm not sure, because they share the same name), and each of them contains a hosts file.

My function apps are scaled out to multiple instances. Whenever I check the hosts file for one of my Function Apps, the metadata of the blob always contains a single entry, like this. I've spent countless hours trying to understand what happens under the hood: I have multiple instances running, but only a single "FunctionInstance" is leasing this blob.

Does this affect the other instances when they process my Service Bus queue messages? Why is this blob leased by only one of the instances all the time? (If I restart the Function App, the blob might get leased by another instance ID, so I guess it is typically the ID of one of the instances of that Function App, but according to the metadata it does stay leased by a single instance.) Even if I am not affected here, could you provide an example where this hosts file is used by Azure Functions, and explain why the host tries to lock it on startup?

I started to experiment locally to better understand what is happening. What I found is the following: the first function that I start (with func start) makes a HEAD request to this blob, then a PUT request with the comp=lease URL parameter, then another HEAD request, and finally a PUT request with the comp=metadata URL parameter. After that I get the following log line: Host lock lease acquired by instance ID '000000000000000000000000711A638F'. It then starts to process my Service Bus messages from the queue. The second function that I start on this machine, however, never logs the Host lock lease acquired by instance ID 'xxx' message. It periodically sends a HEAD request to this blob and apparently never acquires the lease. Even so, this second function is also triggered by the messages in the queue and processes them as intended, just like the first one.

Please explain the logic behind the leasable hosts file. (PS: I have my Function App scaled to 3 instances and I have Azure Functions Premium Plan)


There is 1 best solution below

  • The host lock lease files are kept in the azure-webjobs-hosts container in the storage account linked to your Function App.

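  • The host lock does not restrict Service Bus processing to a single instance. A Function App instance tries to obtain a lease on the host lock file upon startup. If it succeeds, it logs the "Host lock lease acquired by instance ID" message and acts as the primary host, which the runtime relies on for host-level tasks that should run only once per app. If it doesn't succeed, it keeps checking the status of the lease on a regular basis and takes it over as soon as the lease becomes available. Either way, the instance starts its trigger listeners, so every instance keeps receiving messages from the queue. This matches what you saw locally: the second host never acquired the lease but still processed messages.

  • Within the azure-webjobs-hosts container, the locks folder has one folder per host ID, which is derived from the Function App name (this is why the folders are named like your Function App). A hosts file that holds the lease information is present in every folder. Because only one instance of your Function App can hold the lease at once, the blob's metadata always has one entry.

  • Azure Functions also implements its singleton behavior using the locks folder. For example, a timer trigger uses a storage lock so that it fires on only one instance when the app is scaled out. Service Bus triggers are not subject to this singleton behavior: the instances act as competing consumers, and Service Bus itself locks each message while one instance processes it.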

  • The host ID collision detection feature is also implemented using the locks folder. Starting with version 3.x, the Functions runtime detects a host ID collision and logs a warning. Version 4.x logs an error and stops the host, which results in a hard failure. You may get more information on host ID collision here
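
To make the lease behavior described above concrete, here is a minimal in-memory sketch in Python. It does not call Azure Storage or the Functions runtime: FakeLeaseBlob, Host, and simulate are hypothetical names invented for this illustration, the 15-second lease duration is an assumption, and the round-robin message distribution is a simplification of competing consumers. The only detail taken from the question is the "FunctionInstance" metadata key.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FakeLeaseBlob:
    """In-memory stand-in for the lock blob (hypothetical, not the Azure SDK)."""
    lease_seconds: float = 15.0          # assumed lease duration
    holder: Optional[str] = None
    expires_at: float = 0.0
    metadata: dict = field(default_factory=dict)

    def try_acquire(self, instance_id: str, now: float) -> bool:
        # The lease is granted if it is free, expired, or being renewed
        # by the instance that already holds it.
        free = self.holder is None or now >= self.expires_at
        if free or self.holder == instance_id:
            self.holder = instance_id
            self.expires_at = now + self.lease_seconds
            # Only the current holder is recorded, hence the single entry.
            self.metadata = {"FunctionInstance": instance_id}
            return True
        return False


@dataclass
class Host:
    """One Function App instance (hypothetical model)."""
    instance_id: str
    is_primary: bool = False
    processed: list = field(default_factory=list)

    def tick(self, blob: FakeLeaseBlob, now: float) -> None:
        # Periodic attempt to acquire or renew the host lock.
        self.is_primary = blob.try_acquire(self.instance_id, now)

    def handle(self, message: str) -> None:
        # Message handling never checks is_primary.
        self.processed.append(message)


def simulate():
    blob = FakeLeaseBlob()
    hosts = [Host("instance-a"), Host("instance-b"), Host("instance-c")]
    for host in hosts:
        host.tick(blob, now=0.0)
    # Simplified competing consumers: messages are spread across all hosts.
    for i in range(6):
        hosts[i % len(hosts)].handle(f"msg-{i}")
    return blob, hosts


if __name__ == "__main__":
    blob, hosts = simulate()
    print("lock metadata:", blob.metadata)
    for host in hosts:
        print(host.instance_id, "primary:", host.is_primary, "processed:", host.processed)
```

In this sketch only one host ends up holding the lease, yet all three hosts handle messages. If the holder stops renewing, another host picks up the lease on its next tick after expiry, which mirrors the lease moving to a different instance ID after a restart.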


According to this blog:

The azure-webjobs-hosts container, in turn, hosts three folders:

  • Heartbeats - Contains a 0-byte log of every heartbeat check performed on the service.
  • Ids - Contains a single blob directory with a unique identifier for this service.
  • Output Logs - Hosts the output of verbose logs for each run. Explicit logs are logs added to the runtime code by the WebJob developers.