Celery failing on dotcloud deployment with IOError: Socket closed

Celery is failing on one of my dotcloud deployments, and I'm not sure how to fix it. The deployment is almost identical to an existing dotcloud deployment (verified with a file diff) that seems to be working fine.

The error I get in the djcelery log:

dotcloud@hack-default-www-0:/var/log/supervisor$ more djcelery_error.log
/home/dotcloud/env/lib/python2.6/site-packages/django/conf/__init__.py:75: Depre
cationWarning: The ADMIN_MEDIA_PREFIX setting has been removed; use STATIC_URL i
  "use STATIC_URL instead.", DeprecationWarning)
/home/dotcloud/env/lib/python2.6/site-packages/djcelery/loaders.py:108: UserWarn
ing: Using settings.DEBUG leads to a memory leak, never use this setting in prod
uction environments!
  warnings.warn("Using settings.DEBUG leads to a memory leak, never "
[2012-06-04 03:27:32,139: WARNING/MainProcess] -------------- celery@hack-defaul
t-www-0 v2.5.3
---- **** -----
--- * ***  * -- [Configuration]
-- * - **** ---   . broker:      amqp://[email protected]:29210//
- ** ----------   . loader:      djcelery.loaders.DjangoLoader
- ** ----------   . logfile:     [stderr]@INFO
- ** ----------   . concurrency: 2
- ** ----------   . events:      ON
- *** --- * ---   . beat:        OFF
-- ******* ----
--- ***** ----- [Queues]
 --------------   . celery:      exchange:celery (direct) binding:celery

  . experiments.tasks.pushMessageToIphone
  . experiments.tasks.sendTestMessage
[2012-06-04 03:27:32,172: INFO/PoolWorker-1] child process calling self.run()
[2012-06-04 03:27:32,185: INFO/PoolWorker-2] child process calling self.run()
[2012-06-04 03:27:32,188: WARNING/MainProcess] celery@hack-default-www-0 has sta
[2012-06-04 03:27:35,315: ERROR/MainProcess] Consumer: Connection Error: Socket
closed. Trying again in 2 seconds...
[2012-06-04 03:27:40,374: ERROR/MainProcess] Consumer: Connection Error: Socket
closed. Trying again in 4 seconds...
[2012-06-04 03:27:47,479: ERROR/MainProcess] Consumer: Connection Error: Socket
closed. Trying again in 6 seconds...
[2012-06-04 03:27:56,509: ERROR/MainProcess] Consumer: Connection Error: Socket

Interestingly, the celerycam error log shows something a bit different. I'm not sure whether this is a red herring.

/home/dotcloud/env/lib/python2.6/site-packages/django/conf/__init__.py:75: Depre
cationWarning: The ADMIN_MEDIA_PREFIX setting has been removed; use STATIC_URL i
  "use STATIC_URL instead.", DeprecationWarning)
[2012-06-04 03:27:31,373: INFO/MainProcess] -> evcam: Taking snapshots with djce
lery.snapshot.Camera (every 1.0 secs.)

Traceback (most recent call last):
  File "hack/manage.py", line 14, in 
  File "/home/dotcloud/env/lib/python2.6/site-packages/django/core/management/__
init__.py", line 459, in execute_manager
  File "/home/dotcloud/env/lib/python2.6/site-packages/django/core/management/__
init__.py", line 382, in execute
  File "/home/dotcloud/env/lib/python2.6/site-packages/djcelery/management/base.
py", line 74, in run_from_argv
    return super(CeleryCommand, self).run_from_argv(argv)
  File "/home/dotcloud/env/lib/python2.6/site-packages/django/core/management/ba
se.py", line 196, in run_from_argv
    self.execute(*args, **options.__dict__)
  File "/home/dotcloud/env/lib/python2.6/site-packages/djcelery/management/base.
py", line 67, in execute
    super(CeleryCommand, self).execute(*args, **options)
  File "/home/dotcloud/env/lib/python2.6/site-packages/django/core/management/ba
se.py", line 232, in execute
    output = self.handle(*args, **options)
  File "/home/dotcloud/env/lib/python2.6/site-packages/djcelery/management/comma
nds/celerycam.py", line 26, in handle
    ev.run(*args, **options)
  File "/home/dotcloud/env/lib/python2.6/site-packages/celery/bin/celeryev.py",
line 38, in run
  File "/home/dotcloud/env/lib/python2.6/site-packages/celery/bin/celeryev.py",
line 70, in run_evcam
    return cam()
  File "/home/dotcloud/env/lib/python2.6/site-packages/celery/events/snapshot.py
", line 116, in evcam
  File "/home/dotcloud/env/lib/python2.6/site-packages/celery/events/__init__.py
", line 204, in capture
    list(self.itercapture(limit=limit, timeout=timeout, wakeup=wakeup))
  File "/home/dotcloud/env/lib/python2.6/site-packages/celery/events/__init__.py
", line 193, in itercapture
    with self.consumer(wakeup=wakeup) as consumer:
  File "/usr/lib/python2.6/contextlib.py", line 16, in __enter__
    return self.gen.next()
  File "/home/dotcloud/env/lib/python2.6/site-packages/celery/events/__init__.py
", line 185, in consumer
    queues=[self.queue], no_ack=True)
  File "/home/dotcloud/env/lib/python2.6/site-packages/kombu/messaging.py", line
 279, in __init__
  File "/home/dotcloud/env/lib/python2.6/site-packages/kombu/messaging.py", line
 286, in revive
    channel = channel.default_channel
  File "/home/dotcloud/env/lib/python2.6/site-packages/kombu/connection.py", lin
e 581, in default_channel
  File "/home/dotcloud/env/lib/python2.6/site-packages/kombu/connection.py", lin
e 574, in connection
    self._connection = self._establish_connection()
  File "/home/dotcloud/env/lib/python2.6/site-packages/kombu/connection.py", lin
e 533, in _establish_connection
    conn = self.transport.establish_connection()
  File "/home/dotcloud/env/lib/python2.6/site-packages/kombu/transport/amqplib.p
y", line 279, in establish_connection
  File "/home/dotcloud/env/lib/python2.6/site-packages/kombu/transport/amqplib.p
y", line 89, in __init__
    super(Connection, self).__init__(*args, **kwargs)
  File "/home/dotcloud/env/lib/python2.6/site-packages/amqplib/client_0_8/connec
tion.py", line 144, in __init__
    (10, 30), # tune
  File "/home/dotcloud/env/lib/python2.6/site-packages/amqplib/client_0_8/abstra
ct_channel.py", line 95, in wait
    self.channel_id, allowed_methods)
  File "/home/dotcloud/env/lib/python2.6/site-packages/amqplib/client_0_8/connec
tion.py", line 202, in _wait_method
  File "/home/dotcloud/env/lib/python2.6/site-packages/amqplib/client_0_8/method
_framing.py", line 221, in read_method
    raise m
IOError: Socket closed

My supervisord file:

directory = /home/dotcloud/current/
command = /home/dotcloud/env/bin/python hack/manage.py celeryd -E -l info -c 2
stderr_logfile = /var/log/supervisor/%(program_name)s_error.log
stdout_logfile = /var/log/supervisor/%(program_name)s.log

directory = /home/dotcloud/current/
command = /home/dotcloud/env/bin/python hack/manage.py celerycam
stderr_logfile = /var/log/supervisor/%(program_name)s_error.log
stdout_logfile = /var/log/supervisor/%(program_name)s.log

As mentioned, I have nearly identical code deployed under a different dotcloud account that is working fine.

Status of the rabbitmq broker:

$ ./dotcloud info hack.broker
- hackxxxx.dotcloud.com
    password: xxxx
    rabbitmq_management: true
    user: root
created_at: 1338702527.075196
datacenter: Amazon-us-east-1c
image_version: 924a079b622a (latest)
memory: 49M/512M (9%)
-   name: ssh
    url: ssh://[email protected]:29209
-   name: amqp
    url: amqp://root:[email protected]:29210
-   name: http
    url: http://root:[email protected]/
state: running
type: rabbitmq


It looks like the worker is having trouble connecting to your broker. Have you confirmed that the broker is up and running, and that you can connect to it?

What are you using for a broker?
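
One way to run that connectivity check from the www instance is a plain TCP probe against the broker's AMQP endpoint. The sketch below is only an illustration: `can_reach_broker` is a hypothetical helper, not part of Celery or kombu, and the host and port in the usage comment are the placeholders from the `dotcloud info` output in the question. Note that the celerycam traceback fails while amqplib is waiting for the broker's tune frame, which is after the socket was opened, so a successful probe would rule out a network problem but not, for example, wrong credentials or a wrong virtual host.

```python
import socket


def can_reach_broker(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    Hypothetical helper for illustration. It only tests network reachability;
    it does not perform the AMQP handshake or check the login.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        # Covers refused connections, DNS failures, and timeouts.
        return False
    sock.close()
    return True


# Example call (placeholder host and port taken from the question):
# can_reach_broker("hackxxxx.dotcloud.com", 29210)
```

If this returns False, the broker host or port is probably unreachable from the instance. If it returns True and the worker still logs "Socket closed", it may be worth comparing the user, password, and virtual host in the broker URL with the values that `dotcloud info` reports.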