What is the ideal number of workers in django-rq and python-rq?


I have a question regarding django-rq. It is a pip-installed library that acts as a thin layer on top of python-rq, which runs against a Redis instance. Currently, I run all of the jobs on the default queue, which uses database 0 on my local Redis instance.

I spin up workers by running the following script x times for x workers on the default queue:

# start one worker process on the default queue (run once per desired worker)
nohup ~/.virtualenvs/prod/bin/python3 manage.py rqworker default &
# make sure the queues are not left in a suspended state
nohup rq resume --url="redis://localhost:6379/0"
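For reference, the queue those workers listen on is defined in Django settings. Roughly, the django-rq setup behind this looks like the following (a sketch; the DEFAULT_TIMEOUT value is illustrative, not taken from the commands above):

# settings.py
RQ_QUEUES = {
    'default': {
        'HOST': 'localhost',
        'PORT': 6379,
        'DB': 0,
        'DEFAULT_TIMEOUT': 3600,  # illustrative: allow jobs to run up to an hour
    },
}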

I am operating on an 8-core / 32 GB RAM machine. Every incoming request requires a worker to process the job, which can take anywhere from 3 to 60 minutes in a background process that uses OpenCV, Tesseract, and a few other APIs, making a few HTTP requests along the way.
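For jobs this long, the enqueue call needs an explicit timeout, since rq's default is 180 seconds. A rough sketch of that call (process_document is a placeholder name, not the real task):

import django_rq

def process_document(path):
    # placeholder for the real OpenCV / Tesseract / HTTP pipeline
    ...

def handle_request(path):
    queue = django_rq.get_queue('default')
    # job_timeout overrides rq's 180-second default so 60-minute jobs survive
    queue.enqueue(process_document, path, job_timeout=60 * 60)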

How do I know the ideal number of rq workers I should be using? The administrative panel currently shows 8 workers. Is that the ideal number, or should I use 20? How about 100?

How do I account for the following variables in order to choose the correct number of workers to spin up:

  1. number of incoming requests
  2. amount of RAM needed per process
  3. number of cores
  4. likelihood of a worker possibly breaking down

1 Answer


Been using RQ for about a year now.

This depends COMPLETELY on what you're running. If you're doing CPU/memory-intensive calculations, you obviously can't spin up many workers. For example, I do a lot of number crunching, so I run about 2, sometimes 3, RQ workers on a 2 GB RAM VPS. I'm not sure if this is true for everyone, but a django-rq worker that isn't doing anything eats about 150 MB of RAM from the get-go. Maybe I configured something wrong. When it actually processes a job, RAM usage sometimes goes as high as 700 MB per worker.
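A back-of-the-envelope way to size this, using the machine from the question and the per-worker RAM I've seen (the formula is just a rough heuristic, nothing built into RQ):

# rough worker count, bounded by both RAM and cores
total_ram_mb = 32 * 1024        # the 32 GB machine from the question
reserved_mb = 4 * 1024          # assumed headroom for the OS, Redis, Django, etc.
peak_worker_mb = 700            # the worst per-worker usage I've observed
cores = 8

ram_bound = (total_ram_mb - reserved_mb) // peak_worker_mb
cpu_bound = cores               # CPU-bound OpenCV/Tesseract work gains little beyond core count
print(min(ram_bound, cpu_bound))  # -> 8 on that machine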

If you pack in too many jobs, you get a JobFailed error with no clear indication of why. Because of the nature of RQ (asynchronous computing), you really can't tell unless you put in a ton of logging or take on the overhead of measuring and collecting CPU/memory usage. Either that, or run htop and watch the utilization manually.
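If you're on rq 1.x, you can at least pull whatever traceback was captured for a failed job out of the failed job registry instead of guessing. A sketch:

from redis import Redis
from rq import Queue
from rq.registry import FailedJobRegistry

conn = Redis(host='localhost', port=6379, db=0)
queue = Queue('default', connection=conn)

# print the stored traceback for every failed job on the default queue
for job_id in FailedJobRegistry(queue=queue).get_job_ids():
    job = queue.fetch_job(job_id)
    if job is not None:
        print(job_id, job.exc_info)
# note: a worker killed by the OOM killer often leaves nothing useful in exc_info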

My recommendation:

  1. scale horizontally (fewer workers per server) instead of vertically (one beefy machine with tons of workers)
  2. limit the execution time per job: one hundred 1-minute jobs are better than one 100-minute job (see the sketch after this list)
  3. use the microdict and blist modules for large CSV / list processing; they are something like 100x more efficient in RAM / CPU usage
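To illustrate point 2: instead of one job that processes a whole document, enqueue one job per page. A sketch (ocr_page and the per-page timeout are hypothetical, standing in for your own OpenCV/Tesseract code):

import django_rq

def ocr_page(page_path):
    # hypothetical: OpenCV preprocessing + Tesseract on a single page
    ...

def enqueue_document(page_paths):
    queue = django_rq.get_queue('default')
    # many short jobs instead of one long one: failures are cheap to retry,
    # and workers free up often enough for new requests to get a turn
    for path in page_paths:
        queue.enqueue(ocr_page, path, job_timeout=120)  # assumed per-page budget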