Garbage collection when re-executing matplotlib plotting in Jupyter


I'm plotting with matplotlib in a Jupyter notebook. If I re-execute a cell that plots something, the RAM usage increases every time. After some testing, it seems that I need to manually keep track of the created figure, axes and line objects, and garbage collect them myself.

Some years ago, I solved this problem by creating the figure with a unique identifier num and calling plt.close() before re-creating it. However, this no longer seems to work. Notably, with the %matplotlib inline backend, matplotlib does not seem to track the state of the figures at all once they have been shown.

This can be seen in the following script, where the figure with identifier num is created but is no longer known to matplotlib directly afterwards. However, explicitly deleting and garbage collecting the figure, axes and line objects seems to do the trick most of the time.

%matplotlib inline
%reload_ext memory_profiler
%memit

import matplotlib.pyplot as plt
import numpy as np
import gc

num = 123
x = np.random.randn(3000000, 20)

print('\nbefore')
%memit

figure, axis = plt.subplots(num=num)
assert figure.number == num
lines = axis.plot(x)
plt.show()

print('\nplt knows fignum: ', plt.fignum_exists(num))

print('plt.close(num)')
plt.close(num)
%memit gc.collect()

print('\nclf')
figure.clf()
%memit gc.collect()

# necessary to free ram
print('\ndel figure')
del figure
%memit gc.collect()

print('\ndel axis')
del axis
%memit gc.collect()

print('\ndel lines')
del lines
%memit gc.collect()

print('\nafter')
%memit

If you execute this multiple times, the RAM usage at some point stays the same between runs, e.g.:

peak memory: 2282.13 MiB, increment: 0.00 MiB

before
peak memory: 2282.21 MiB, increment: 0.00 MiB

plt knows fignum:  False
plt.close(num)
peak memory: 4754.65 MiB, increment: 0.00 MiB

clf
peak memory: 4754.53 MiB, increment: 0.00 MiB

del figure
peak memory: 4754.53 MiB, increment: 0.00 MiB

del axis
peak memory: 4754.53 MiB, increment: 0.00 MiB

del lines
peak memory: 2282.59 MiB, increment: 0.00 MiB

after
peak memory: 2282.59 MiB, increment: 0.00 MiB
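
One possible reading of this output (an assumption, not something confirmed here) is that the plotted data stays alive as long as any one of figure, axis or lines is still referenced, because the objects point at each other. The sketch below reproduces that pattern with plain Python stand-ins; the classes Figure, Axis, Line and Data are hypothetical and are not matplotlib's classes.

```python
import gc
import weakref


class Data:
    """Stand-in for the large array held by a plotted line."""


class Figure:
    def __init__(self):
        self.axes = []


class Axis:
    def __init__(self, figure):
        self.figure = figure  # back-reference to the parent
        self.lines = []


class Line:
    def __init__(self, axis, data):
        self.axis = axis  # back-reference to the parent
        self.data = data  # the line keeps the data alive


def build():
    """Create a figure/axis/line graph with mutual references."""
    figure = Figure()
    axis = Axis(figure)
    figure.axes.append(axis)
    line = Line(axis, Data())
    axis.lines.append(line)
    # The weak reference lets us observe the data without keeping it alive.
    return figure, axis, [line], weakref.ref(line.data)


figure, axis, lines, data_ref = build()
alive = []

del figure
gc.collect()
alive.append(data_ref() is not None)  # axis and lines still reach the data

del axis
gc.collect()
alive.append(data_ref() is not None)  # lines still reach the data

del lines
gc.collect()
alive.append(data_ref() is not None)  # nothing reaches the data any more

print(alive)  # [True, True, False]
```

In this toy model the data is only released after the last of the three names is deleted, which matches the step where the memory drops in the measurements above.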

With %matplotlib widget, on the other hand, the figure is known to plt.fignum_exists(num), but closing it still has no effect. The rest of the behavior is roughly the same.

Question: Have I overlooked something? Is it really this complicated? At this point, the "simplest" solution seems to be to write some sort of plot manager object that tracks the state of all this. When a figure with the same identifier is replotted, the manager could release the old objects and garbage collect them correctly. So basically, do manually what matplotlib used to do.
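
To make the plot manager idea concrete, here is a minimal sketch of what such an object might look like. Everything in it is hypothetical: the names PlotManager, replot, release and close_matplotlib_handles are invented for illustration, and it is untested against the memory behavior described above. It assumes that the manager holds the only references to the figure, axes and lines, so that dropping them and calling gc.collect() is enough.

```python
import gc
from typing import Any, Callable, Dict, Hashable


class PlotManager:
    """Hypothetical helper that keeps one set of plot objects per identifier."""

    def __init__(self, close: Callable[[Any], None]) -> None:
        self._close = close  # backend-specific cleanup, supplied by the caller
        self._plots: Dict[Hashable, Any] = {}

    def release(self, key: Hashable) -> None:
        """Close and forget the objects stored under key, then collect."""
        handles = self._plots.pop(key, None)
        if handles is not None:
            self._close(handles)
            del handles
        gc.collect()

    def replot(self, key: Hashable, draw: Callable[[], Any]) -> None:
        """Release the previous plot for key, then store the new handles.

        Returns nothing on purpose, so the notebook's output cache does not
        end up holding another reference to the plot objects.
        """
        self.release(key)
        self._plots[key] = draw()


def close_matplotlib_handles(handles: Any) -> None:
    """Assumed cleanup for a (figure, axis, lines) tuple from matplotlib."""
    import matplotlib.pyplot as plt  # imported lazily; needs matplotlib

    figure, _axis, _lines = handles
    figure.clf()
    plt.close(figure)
```

In a notebook cell this might be used as manager = PlotManager(close_matplotlib_handles), followed by manager.replot(num, draw), where draw is a function that creates the figure with plt.subplots(num=num), plots into it and returns (figure, axis, lines). The plotting code inside draw must not leave its own references in the notebook namespace, otherwise the old objects cannot be collected.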
