Make supervisor stop Celery workers correctly


I have run into a lot of weird behavior when using Celery. For example, I update tasks.py and run supervisorctl reload (or restart), but the tasks are still wrong: some tasks seem to disappear, and so on.
Today I found the cause: supervisorctl stop all cannot stop all the Celery workers. Only kill -9 `pgrep python` kills them all.

situation:

    root@ubuntu12:/data/www/article_fetcher# supervisorctl
    celery_beat                      RUNNING    pid 29597, uptime 0:52:18
    celery_worker1                   RUNNING    pid 29556, uptime 0:52:20
    celery_worker2                   RUNNING    pid 29570, uptime 0:52:19
    celery_worker3                   RUNNING    pid 29557, uptime 0:52:20
    celery_worker4                   RUNNING    pid 29586, uptime 0:52:18
    uwsgi                            RUNNING    pid 29604, uptime 0:52:18
    supervisor> stop all
    celery_beat: stopped
    celery_worker2: stopped
    celery_worker4: stopped
    celery_worker3: stopped
    uwsgi: stopped
    celery_worker1: stopped
    supervisor> status
    celery_beat                      STOPPED    Aug 04 11:05 AM
    celery_worker1                   STOPPED    Aug 04 11:05 AM
    celery_worker2                   STOPPED    Aug 04 11:05 AM
    celery_worker3                   STOPPED    Aug 04 11:05 AM
    celery_worker4                   STOPPED    Aug 04 11:05 AM
    uwsgi                            STOPPED    Aug 04 11:05 AM

processes:

    root@ubuntu12:~# ps -aux|grep 'python'
    Warning: bad ps syntax, perhaps a bogus '-'? See http://procps.sf.net/faq.html
    root      8683  0.0  0.1  61420 11768 ?        Ss   Aug03   0:27 /usr/bin/python /usr/bin/supervisord
    root     29310  0.1  0.1  57120 11344 pts/2    S+   11:05   0:00 /usr/bin/python /usr/bin/supervisorctl
    nobody   29556  2.2  0.5 132484 45988 ?        S    11:06   0:00 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W1 -Ofair --app=celery_worker:app
    nobody   29557  2.2  0.5 132480 45996 ?        S    11:06   0:00 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W3 -Ofair --app=celery_worker:app
    nobody   29570  2.4  0.5 132740 45996 ?        S    11:06   0:00 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W2 -Ofair --app=celery_worker:app
    nobody   29571 26.9  1.4 217688 115804 ?       R    11:06   0:09 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W3 -Ofair --app=celery_worker:app
    nobody   29572 33.7  0.7 158396 59808 ?        R    11:06   0:12 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W3 -Ofair --app=celery_worker:app
    nobody   29573 29.6  1.4 215176 115928 ?       R    11:06   0:10 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W1 -Ofair --app=celery_worker:app
    nobody   29574 27.2  1.4 218244 118180 ?       R    11:06   0:09 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W3 -Ofair --app=celery_worker:app
    ......
    ......
    ......

I found this question: Stopping Supervisor doesn't stop Celery workers. But it asks about something different, and the accepted answer, supervisorctl stop all, does not actually work. So I decided to find the right way.

Best answer:

I looked into the supervisor docs and found this:

killasgroup

If true, when resorting to send SIGKILL to the program to terminate it send it to its whole process group instead, taking care of its children as well, useful e.g with Python programs using multiprocessing.

Default: false

Required: No.

Introduced: 3.0a11

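My guess is that each worker creates 4 child processes (one per CPU core), and together they form a process group. Supervisor only signals the parent process it started, so the children are left behind. That's why supervisorctl stop all does not work.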
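To check that guess outside of Celery, here is a minimal Python sketch (POSIX only). The fake worker script, the `children_survive` helper, and the 1-second timeout are all made up for illustration; this is not how supervisor itself is implemented. It starts a parent in its own process group, lets it fork one child, then sends SIGKILL either to the parent PID alone or to the whole group, and uses a shared pipe to see whether the child outlived the parent.

```python
import os
import select
import signal
import subprocess
import sys

# Hypothetical stand-in for a Celery worker: the parent forks one "pool" child,
# reports that it is ready, and then both just sleep.
FAKE_WORKER = """
import os, time
if os.fork() == 0:
    time.sleep(60)
    os._exit(0)
print("ready", flush=True)
time.sleep(60)
"""


def children_survive(as_group, timeout=1.0):
    """Kill a fake worker and report whether its forked child is still running."""
    proc = subprocess.Popen(
        [sys.executable, "-c", FAKE_WORKER],
        stdout=subprocess.PIPE,
        start_new_session=True,  # new session, so the worker leads its own process group
    )
    proc.stdout.readline()       # block until the child has been forked
    pgid = os.getpgid(proc.pid)  # look this up while the parent is still alive

    if as_group:
        os.killpg(pgid, signal.SIGKILL)    # signal every process in the group
    else:
        os.kill(proc.pid, signal.SIGKILL)  # signal the parent only
    proc.wait()

    # The child inherited the write end of the stdout pipe, so the pipe only
    # becomes readable (EOF) once every process holding it has exited.
    readable, _, _ = select.select([proc.stdout], [], [], timeout)
    survived = not readable

    if survived:
        os.killpg(pgid, signal.SIGKILL)  # clean up the orphaned child
    proc.stdout.close()
    return survived


if __name__ == "__main__":
    print("parent only:", children_survive(as_group=False))
    print("whole group:", children_survive(as_group=True))
```

On Linux this should report that the child survives when only the parent is killed, and that it does not when the signal goes to the whole group, which matches the leftover processes in the ps listing.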
So I added killasgroup to supervisord.conf:

    [program:celery_worker1]
    ; Set full path to celery program if using virtualenv

    directory=/data/www/article_fetcher

    command=/data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W2 -Ofair --app=celery_worker:app
    user=nobody
    numprocs=1
    stdout_logfile=/data/www/article_fetcher/logs/celery.log
    stderr_logfile=/data/www/article_fetcher/logs/celery.log
    autostart=true
    autorestart=true
    startsecs=5
    killasgroup=true

    .....
    .....

After that, supervisorctl stop all really stops the Celery workers. Very nice~
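If you want to double-check that nothing is left behind after a stop, a small Python helper like the one below can count leftover processes per worker name from ps output. This is only a sketch: `count_by_worker` and the regular expression are my own names, and it assumes the command lines look like the ones in the question (`celery worker -n <name>`).

```python
import re
from collections import Counter

# Matches the node name passed with "-n" on a "celery worker" command line.
NODE_NAME = re.compile(r"celery worker\b.*?\s-n\s+(\S+)")


def count_by_worker(ps_output):
    """Count processes per Celery node name in `ps aux`-style text."""
    counts = Counter()
    for line in ps_output.splitlines():
        match = NODE_NAME.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Feed it the text of `ps aux` (for example via `subprocess.check_output(["ps", "aux"], text=True)`). An empty result after supervisorctl stop all means no worker processes were found.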