Celery worker dies after random time with IronMQ


Running Django 1.4.10, Celery 3.1.7, Python 2.7, and Kombu 3.0.12, with django-supervisor running both the celery worker and celery beat as daemons. IronMQ is the broker.

Periodic tasks are processed fine for a number of days, then all of a sudden the worker dies with an unrecoverable error:

[2014-08-04 16:52:17,647: ERROR/MainProcess] Unrecoverable error: TypeError("sequence index must be integer, not 'unicode'",)
Traceback (most recent call last):
  File "/home/kromedev/webapps/gorilla/lib/python2.7/celery-3.1.7-py2.7.egg/celery/worker/__init__.py", line 206, in start
    self.blueprint.start(self)
  File "/home/kromedev/webapps/gorilla/lib/python2.7/celery-3.1.7-py2.7.egg/celery/bootsteps.py", line 123, in start
    step.start(parent)
  File "/home/kromedev/webapps/gorilla/lib/python2.7/celery-3.1.7-py2.7.egg/celery/bootsteps.py", line 373, in start
    return self.obj.start()
  File "/home/kromedev/webapps/gorilla/lib/python2.7/celery-3.1.7-py2.7.egg/celery/worker/consumer.py", line 270, in start
    blueprint.start(self)
  File "/home/kromedev/webapps/gorilla/lib/python2.7/celery-3.1.7-py2.7.egg/celery/bootsteps.py", line 123, in start
    step.start(parent)
  File "/home/kromedev/webapps/gorilla/lib/python2.7/celery-3.1.7-py2.7.egg/celery/worker/consumer.py", line 786, in start
    c.loop(*c.loop_args())
  File "/home/kromedev/webapps/gorilla/lib/python2.7/celery-3.1.7-py2.7.egg/celery/worker/loops.py", line 99, in synloop
    connection.drain_events(timeout=2.0)
  File "/home/kromedev/webapps/gorilla/lib/python2.7/kombu-3.0.12-py2.7.egg/kombu/connection.py", line 279, in drain_events
    return self.transport.drain_events(self.connection, **kwargs)
  File "/home/kromedev/webapps/gorilla/lib/python2.7/kombu-3.0.12-py2.7.egg/kombu/transport/virtual/__init__.py", line 844, in drain_events
    self._callbacks[queue](message)
  File "/home/kromedev/webapps/gorilla/lib/python2.7/kombu-3.0.12-py2.7.egg/kombu/transport/virtual/__init__.py", line 529, in _callback
    message = self.Message(self, raw_message)
  File "/home/kromedev/webapps/gorilla/lib/python2.7/kombu-3.0.12-py2.7.egg/kombu/transport/virtual/__init__.py", line 242, in __init__
    properties = payload['properties']
TypeError: sequence index must be integer, not 'unicode'
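
The last frame fails on payload['properties'], which suggests the decoded message was some sequence rather than a dict. As a rough sketch only (not kombu's or IronMQ's API), a check like the one below could be run over raw message bodies pulled from the queue to spot a malformed one. The names is_well_formed and REQUIRED_KEYS are hypothetical, and JSON-encoded bodies are an assumption:

```python
import json
from collections.abc import Mapping

# Assumption for illustration: the only key we insist on is the one the
# failing frame indexes directly (payload['properties']).
REQUIRED_KEYS = ("properties",)


def is_well_formed(raw):
    """Return True if `raw` decodes to a dict-like payload with the expected keys."""
    try:
        payload = json.loads(raw)
    except (TypeError, ValueError):
        # Not JSON at all (or not a string), so certainly not a usable payload.
        return False
    return isinstance(payload, Mapping) and all(k in payload for k in REQUIRED_KEYS)


# Hypothetical sample bodies: one dict-shaped, one list-shaped.
good = json.dumps({"body": "e30=", "properties": {"delivery_tag": "abc"}})
bad = json.dumps(["not", "a", "dict"])

print(is_well_formed(good))
print(is_well_formed(bad))
```

Indexing a list-shaped payload with a string key raises a TypeError on any Python version, although the exact wording of the message depends on the version and on the sequence type involved.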

Judging by the timestamps, it always seems to crash well after all the tasks have finished. Because my supervisor.conf sets autorestart=true, supervisor then goes into a loop, restarting the worker only for it to fail again instantly. However, if I shut down all processes with supervisor and start them again, everything works for another few days.

Supervisor.conf:

[program:celeryd]
command={{ PYTHON }} {{ PROJECT_DIR }}/manage.py celery worker -l info
numprocs=1
stdout_logfile={{ PROJECT_DIR }}/worker.log
stderr_logfile={{ PROJECT_DIR }}/worker.log
autostart=true
autorestart=true
startsecs=10
stopwaitsecs=600
killasgroup=true
priority=998

[program:celerybeat]
command={{ PYTHON }} {{ PROJECT_DIR }}/manage.py celery beat -l INFO
numprocs=1
stdout_logfile={{ PROJECT_DIR }}/beat.log
stderr_logfile={{ PROJECT_DIR }}/beat.log
autostart=true
autorestart=true
startsecs=10
priority=999 

No idea where to even begin debugging this one. Throwing it out there in case anyone else has had the same issue and can shed some light on it.

UPDATE: I switched back to trusty RabbitMQ (Erlang) and the issue disappeared completely, which points to IronMQ. I couldn't really get any help from them about it, so they lost a customer!
