Multiprocess executors in Python


I am trying to run a report on my Azure Data Lake Storage Gen2 account. I have written the recursive function below, which descends into every folder and lists the files down to the last level.

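def recursive_ls(path: str):
    """List all files from path recursively."""
    for file in dbutils.fs.ls(path):
        # Directory paths end with '/'; compare by value, not identity ("is not").
        if not file.path.endswith('/'):
            yield (file.path.split('/')[3:11], file.size)
        else:
            # Recurse into the sub-folder and pass its files through.
            yield from recursive_ls(file.path)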

I have a very large number of files, and as a result this function has not finished even after 2 hours.

This might be happening because the listing is currently handled by a single process, one folder at a time. I need some way to run these listing calls concurrently, for example with a multiprocessing or executor-based approach.
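For illustration, here is a minimal sketch of one way this could look. It swaps in a thread pool (`concurrent.futures.ThreadPoolExecutor`) rather than separate processes, on the assumption that each listing call spends most of its time waiting on the storage service. The function name `parallel_ls`, its `list_dir` parameter, and the default of 16 workers are hypothetical and not part of the original code. `list_dir` stands for any callable that behaves like `dbutils.fs.ls`, returning entries with `path` and `size` attributes, with folder paths ending in `/`.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def parallel_ls(root, list_dir, max_workers=16):
    """Return (path, size) for every file under root.

    list_dir is a hypothetical stand-in for dbutils.fs.ls: it takes a folder
    path and returns entries with .path and .size; folder paths end with '/'.
    """
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Start with a single listing task for the root folder.
        pending = {pool.submit(list_dir, root)}
        while pending:
            # Block until at least one listing finishes.
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                for entry in future.result():
                    if entry.path.endswith('/'):
                        # Sub-folder: schedule its listing on the pool.
                        pending.add(pool.submit(list_dir, entry.path))
                    else:
                        results.append((entry.path, entry.size))
    return results
```

Under those assumptions it would be called as `parallel_ls(path, dbutils.fs.ls)`. Only the main thread appends to `results`, so no lock is needed. The order of the returned files is not deterministic, and whether this actually speeds things up depends on how the storage service handles concurrent listing requests.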
