I am using azure blob storage to store data and feeding this data to Autoloader using mount. I was looking for a way to allow Autoloader to load a new file from any mount. Let's say I have these folders in my mount:
mnt/
├─ blob_container_1
├─ blob_container_2
When I use .load('/mnt/') no new files are detected. But when I consider folders individually then it works fine like .load('/mnt/blob_container_1')
I want to load files from both mount paths using Autoloader (running continuously).
You can use the path for providing prefix patterns, for example:
For example, if you would like to parse only png files within a directory that contains files with different suffixes, you can do:
Refer – https://docs.databricks.com/spark/latest/structured-streaming/auto-loader.html#filtering-directories-or-files-using-glob-patterns