Determining if a memory leak is occurring in Python

My understanding is that a memory leak in Python (CPython 2.0 or later, at least) can only occur under the following circumstances:

  1. A circular reference graph that contains one or more objects with a __del__ method
  2. An extension C/C++ or other native code module that executes and leaks memory internally

Of course, we need to distinguish between an actual memory leak (a program whose objects can never be reclaimed by the garbage collector or by regular reference counting) and a program that simply runs out of memory because it keeps allocating objects that never die but remain reachable, usually because their reference graph connects to some global variable.

To distinguish between these two circumstances (i.e. an actual memory leak vs. a program that just keeps allocating collectable objects which never go out of reach), can we simply call gc.collect() repeatedly and check that the return value is 0?

In other words, if the following program never fails with an AssertionError (due to the assertion in Thread 2) have we effectively proved that there is no memory leak (as defined above)?

Thread 1:
   ... run actual application code ...

Thread 2:
  import gc, time

  while True:
    num = gc.collect()   # number of unreachable objects found
    assert num == 0
    time.sleep(WAIT_TIME)

To be clear, I'm only asking whether this program would prove that an actual memory leak, as defined by cases (1) and (2) above, is NOT happening. I realize it wouldn't prove that the program will never run out of memory due to too many allocations.

There is 1 best solution below

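This program is very likely to raise an AssertionError even if there is no memory leak. It will do so any time the GC finds anything to collect, because gc.collect() returns the number of unreachable objects it found, and ordinary collectable reference cycles make that number nonzero. On the other hand, the return value does not single out what you refer to as "actual memory leaks": it is just a count of unreachable objects, and memory leaked inside native code is invisible to it.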
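
To illustrate the first point, here is a minimal sketch. The Node class and the count_collected_after_cycle helper are made-up names for illustration, not part of the question's code. It builds an ordinary reference cycle with no __del__ method and shows that gc.collect() reports a nonzero count even though everything is reclaimed:

```python
import gc


class Node:
    """Plain object with no __del__; used only to build a reference cycle."""

    def __init__(self):
        self.other = None


def count_collected_after_cycle():
    """Create one unreachable cycle and return what gc.collect() reports for it."""
    gc.collect()                  # clear out any garbage that already exists
    was_enabled = gc.isenabled()
    gc.disable()                  # keep automatic collection from running first
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a   # a and b now keep each other alive
        del a, b                  # unreachable, but reference counts are not zero
        return gc.collect()       # number of unreachable objects found
    finally:
        if was_enabled:
            gc.enable()


found = count_collected_after_cycle()
print(found > 0)         # True: the cycle was counted, so the assert would fail
print(len(gc.garbage))   # 0: nothing was left uncollectable
```

The exact count depends on the interpreter version, which is why the sketch only checks that it is nonzero.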

In short, no, this program will not detect memory leaks correctly at all. If you want to catch case (1), you can periodically check that gc.garbage is empty. (On Python 3.4 and later, PEP 442 lets the collector finalize cycles whose objects have __del__ methods, so gc.garbage normally stays empty anyway.) That check will not catch case (2), because the GC only becomes involved in managing an extension module's memory if the module asks (and even then, only to the extent that the module correctly tracks its owned references). You need something like Valgrind for the general case.
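
A periodic gc.garbage check for case (1) could look roughly like the sketch below. The function names, the callback, and the interval are assumptions chosen for illustration. It says nothing about case (2), since memory leaked inside native code never appears in gc.garbage:

```python
import gc
import threading


def uncollectable_count():
    """Run a full collection and return how many objects sit in gc.garbage."""
    gc.collect()
    return len(gc.garbage)


def watch_garbage(interval_seconds, stop_event, on_leak):
    """Check gc.garbage every interval_seconds until stop_event is set.

    on_leak is a hypothetical callback that receives the number of
    uncollectable objects whenever that number is nonzero.
    """
    while not stop_event.is_set():
        count = uncollectable_count()
        if count:
            on_leak(count)
        # Sleeps for the interval, but wakes up early if asked to stop.
        stop_event.wait(interval_seconds)
```

It could be started next to the application with something like threading.Thread(target=watch_garbage, args=(60.0, stop_event, print), daemon=True).start(), where stop_event is a threading.Event. Unlike the assert in the question, this only reacts to objects the collector could not free, not to every collectable cycle.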