I have sample FastAPI code that initializes a ProcessPoolExecutor during startup:
```python
from concurrent.futures import ProcessPoolExecutor
from contextlib import asynccontextmanager

from fastapi import FastAPI


@asynccontextmanager
async def lifespan(app: FastAPI):
    applog.info(f"{ENGINE_MODULENAME} app is starting executor...")
    # Fall back to the ProcessPoolExecutor default (the CPU count) when the
    # configured pool size is not positive.
    process_pool_size = constant.PROCESS_POOL_SIZE if constant.PROCESS_POOL_SIZE > 0 else None
    app.state.executor = ProcessPoolExecutor(process_pool_size)
    # _max_workers is a private attribute, used here only for logging.
    applog.info(f"Executor initiated with {app.state.executor._max_workers} workers.")

    preloaded_models = {}
    app.state.model = preloaded_models
    applog.info("Application loaded with serving models.")

    preloaded_data = {}
    app.state.data = preloaded_data
    applog.info("Application loaded with serving data.")

    yield

    del preloaded_models
    del app.state.model
    applog.info("Application removed serving models.")

    del preloaded_data
    del app.state.data
    applog.info("Application removed serving data.")

    applog.info(f"{ENGINE_MODULENAME} app is shutting down executor...")
    app.state.executor.shutdown()
    applog.info("Executor shut down successfully.")


app = FastAPI(title=__name__, lifespan=lifespan)
```
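For context, the endpoints hand CPU-bound work to the pool roughly like this (simplified; `heavy_inference` and the `/predict` route are placeholders for the actual expensive call in my workflow):

```python
import asyncio

from fastapi import Request


def heavy_inference(payload: dict) -> dict:
    # Placeholder for the real CPU-bound work; runs inside a pool process.
    return {"echo": payload}


@app.post("/predict")
async def predict(request: Request, payload: dict):
    loop = asyncio.get_running_loop()
    # Offload the blocking call to the pool so the event loop stays responsive.
    result = await loop.run_in_executor(
        request.app.state.executor, heavy_inference, payload
    )
    return result
```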
I will start the application via gunicorn:
```
gunicorn --bind :8002 --worker-class uvicorn.workers.UvicornWorker --workers 4 --timeout 300 --preload --chdir interface app.main:app
```
For each gunicorn worker, the same log line prints, showing that an executor was initiated with its maximum number of workers (in this case, a separate ProcessPoolExecutor per gunicorn worker, so four pools in total).
Would this behaviour overload my application and cause it to crash, given that all 4 gunicorn workers print the same logs and therefore each create their own process pool?
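To make the concern concrete, this is the process arithmetic as I understand it; the 8-core figure is only an example, the real machine may differ:

```python
import os

GUNICORN_WORKERS = 4             # --workers 4 from the command above
POOL_SIZE = os.cpu_count() or 1  # ProcessPoolExecutor default when max_workers is None

# Each gunicorn worker runs the lifespan and builds its own pool, so on an
# 8-core machine this would be 4 uvicorn workers plus 4 * 8 = 32 pool processes.
total_pool_processes = GUNICORN_WORKERS * POOL_SIZE
print(f"{GUNICORN_WORKERS} web workers, {total_pool_processes} pool processes")
```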
What is the correct way to combine ProcessPoolExecutor with Gunicorn to achieve optimal load management in an environment with multiple CPU cores?
I have tried running the code above but have not put it through load testing yet, as the process is very expensive in the current workflow.