Debugging memory leak in Python service that uses Redis pub/sub

I'm trying to build a Python 2.7 service that uses Redis publish/subscribe. I'm running redis 2.8.17 on Ubuntu 12.04 with redis-py 2.10.3 as the client. Unfortunately, my service seems to be leaking memory: memory consumption grows roughly linearly with the number of messages the service receives/consumes/handles.
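For context, the subscriber loop follows the standard redis-py pub/sub pattern; a minimal sketch (channel name and callback are placeholders, not my real code):

    import redis

    def run_service(callback):
        client = redis.StrictRedis(host='localhost', port=6379)
        pubsub = client.pubsub()
        pubsub.subscribe('events')           # placeholder channel name
        for message in pubsub.listen():      # blocks, yielding one dict per message
            if message['type'] == 'message':
                callback(message['data'])    # application-level handling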

I tried to debug this with memory_profiler by decorating my main subscribe loop. To get continuous output, I changed the loop to exit after every hundredth message it receives. The output looks like this:

Line #    Mem usage    Increment   Line Contents
================================================
    62     39.3 MiB      0.0 MiB       @memory_profiler.profile
    63                                 def _listen(self, callback):
    64     40.1 MiB      0.7 MiB           for _ in self.redis_pubsub.listen():
    65     40.1 MiB      0.0 MiB               self.count += 1
    66     40.1 MiB      0.0 MiB               self._consume(callback)
    67     40.1 MiB      0.0 MiB               if self.count == 100:
    68     40.1 MiB      0.0 MiB                   self.count = 0
    69     40.1 MiB      0.0 MiB                   break
    70     40.1 MiB      0.0 MiB           gc.collect()

It reports a similar increase for every hundred messages pushed to the service. The callback is the function that does the actual application work, so line 65 is where I'd expect a memory increase if something were wrong in my app code.

The output made me suspect the redis client, so I also checked the sizes of the self.redis_pubsub and redis.StrictRedis objects using pympler.asizeof. These objects are small to begin with and do not grow at all as the service receives messages.
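The check itself is roughly the following (run inside the service after a batch of messages; self.redis is assumed to be the StrictRedis instance):

    from pympler import asizeof

    print asizeof.asizeof(self.redis)         # stays small
    print asizeof.asizeof(self.redis_pubsub)  # stays small as well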

Further, when I look for leaking objects using pympler.muppy and pympler.summary, they do not report any growing object counts or accumulating memory whatsoever. Also, the totals for memory consumption and growth do not resemble the numbers reported by top on Linux.
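The muppy/summary check looked roughly like this (a snapshot before and after a batch of messages, then a diff):

    from pympler import muppy, summary

    before = summary.summarize(muppy.get_objects())
    # ... let the service handle a few hundred messages ...
    after = summary.summarize(muppy.get_objects())
    summary.print_(summary.get_diff(before, after))  # shows no meaningful growth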

I'm stuck. Does anyone have an idea what might be going on, or how I can debug this further?

1 Answer

I spent hours debugging the same problem in a pub/sub setup. There is indeed a memory leak, and I couldn't find a way to avoid it while publishing messages. My workaround was to run the publishing part in a separate process using multiprocessing. This worked for me because I only publish messages every few seconds, so it was a reasonable tradeoff.
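A rough sketch of that workaround (channel and payload names are made up); because the child process exits after publishing, whatever it allocated is released with it:

    import multiprocessing
    import redis

    def _publish(channel, payload):
        client = redis.StrictRedis(host='localhost', port=6379)
        client.publish(channel, payload)

    def publish_in_subprocess(channel, payload):
        proc = multiprocessing.Process(target=_publish, args=(channel, payload))
        proc.start()
        proc.join()  # memory used while publishing dies with the child process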

An alternative that worked for me without leaking was tornadis.