Currently I have 2 containers. The main container is an R Shiny application that sends a POST request to the second container, which is a Python Flask API container. The Python container loads some inputs, performs image preprocessing, runs an .onnx image classification model and writes a predictions .csv to a mounted volume folder.
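
For context, the Flask side is shaped roughly like this (route name, paths and the response are simplified placeholders, not the exact code):

```python
# Rough shape of the Flask API container (illustrative names and paths)
import glob

from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/predict", methods=["POST"])
def predict():
    image_dir = request.json["image_dir"]      # folder on the mounted volume
    paths = glob.glob(f"{image_dir}/*.jpg")    # ~130,000 paths for the large run
    # preprocessing + ONNX inference + CSV write happen here
    # (see the sketch of that loop further down)
    return jsonify({"status": "done", "n_images": len(paths)})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```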

When running the process on roughly 2,000 images, it works fine and completes in under a minute. When running it on roughly 130,000 images, the container exits with code 247, which as far as I can tell indicates Docker killed it over memory. This happens during the image preprocessing stage of the process.

What I've done so far:

  • Increased the Docker Desktop memory limit to 14GB

  • Revamped the image preprocessing implementation multiple times, trying batching (with different batch sizes), generators, and mmap for accessing images.

Currently I'm just using glob to create a generator of image file paths and feeding those paths into the preprocessing function, roughly as sketched below.
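
A simplified version of what that loop looks like (model path, image size and batch size are placeholders, not the real values):

```python
# Simplified sketch of the current preprocessing / inference loop
import csv
import glob
import itertools

import numpy as np
import onnxruntime as ort
from PIL import Image

session = ort.InferenceSession("/models/classifier.onnx")
input_name = session.get_inputs()[0].name

def preprocess(path, size=(224, 224)):
    """Load one image, resize, normalise to float32 CHW."""
    img = Image.open(path).convert("RGB").resize(size)
    return (np.asarray(img, dtype=np.float32) / 255.0).transpose(2, 0, 1)

def classify_folder(image_dir, out_csv, batch_size=64):
    paths = glob.iglob(f"{image_dir}/*.jpg")   # generator of file paths
    with open(out_csv, "w", newline="") as f:
        writer = csv.writer(f)
        while True:
            chunk = list(itertools.islice(paths, batch_size))
            if not chunk:
                break
            batch = np.stack([preprocess(p) for p in chunk])
            scores = session.run(None, {input_name: batch})[0]
            for p, row in zip(chunk, scores):
                writer.writerow([p, int(np.argmax(row))])
```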

When I run the process on the large image folder (~130,000 images), docker stats shows the container sitting at around 12GB before it exits with code 247.

Are there any major changes I can try to implement that might help with memory usage? Docker settings, Python code, etc.

Any help would be appreciated, thanks.

Docker Desktop version: 4.25.2, Docker image Python version: 3.8, PC: macOS Sonoma 14.1.1
