I'm plotting with matplotlib in a Jupyter notebook. If I re-execute a cell that plots something, the RAM usage increases every time. After some testing, it seems that I need to manually keep track of the created figure, axes and line objects, and garbage collect them.
Some years ago, I solved this problem by creating the figure with a unique identifier num and calling plt.close() before re-creating the figure. However, this no longer seems to work. Notably, with the %matplotlib inline backend, matplotlib does not even seem to keep track of the figures at all.
This can be seen in the following script, where the figure with identifier num is created but, directly afterwards, is not known to matplotlib. However, explicitly deleting the figure, axes and line objects and then running the garbage collector seems to do the trick most of the time.
%matplotlib inline
%reload_ext memory_profiler
%memit
import matplotlib.pyplot as plt
import numpy as np
import gc
num = 123
x = np.random.randn(3000000, 20)
print('\nbefore')
%memit
figure, axis = plt.subplots(num=num)
assert figure.number == num
lines = axis.plot(x)
plt.show()
print('\nplt knows fignum: ', plt.fignum_exists(num))
print('plt.close(num)')
plt.close(num)
%memit gc.collect()
print('\nclf')
figure.clf()
%memit gc.collect()
# necessary to free ram
print('\ndel figure')
del figure
%memit gc.collect()
print('\ndel axis')
del axis
%memit gc.collect()
print('\ndel lines')
del lines
%memit gc.collect()
print('\nafter')
%memit
If you execute this multiple times, the RAM usage at some point stops increasing, for example:
peak memory: 2282.13 MiB, increment: 0.00 MiB
before
peak memory: 2282.21 MiB, increment: 0.00 MiB
plt knows fignum: False
plt.close(num)
peak memory: 4754.65 MiB, increment: 0.00 MiB
clf
peak memory: 4754.53 MiB, increment: 0.00 MiB
del figure
peak memory: 4754.53 MiB, increment: 0.00 MiB
del axis
peak memory: 4754.53 MiB, increment: 0.00 MiB
del lines
peak memory: 2282.59 MiB, increment: 0.00 MiB
after
peak memory: 2282.59 MiB, increment: 0.00 MiB
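One possible reading of the output above, offered as an assumption rather than a confirmed diagnosis: the memory only drops after the last of figure, axis and lines is deleted, which is what you would expect if these objects reference each other, so that any one surviving name keeps the whole group alive. The sketch below illustrates that effect with stand-in classes (FakeFigure and FakeAxes are hypothetical and are not matplotlib types).

```python
import gc
import weakref


class FakeFigure:
    """Stand-in for a figure; NOT matplotlib.figure.Figure."""

    def __init__(self):
        self.payload = bytearray(10_000_000)  # pretend this is plot data
        self.axes = []


class FakeAxes:
    """Stand-in for an axes object that points back at its figure."""

    def __init__(self, figure):
        self.figure = figure          # back-reference to the parent
        figure.axes.append(self)      # parent also references the child


figure = FakeFigure()
axis = FakeAxes(figure)
alive = weakref.ref(figure)           # lets us observe when it is freed

del figure
gc.collect()
alive_after_first_del = alive() is not None
print(alive_after_first_del)          # True: axis.figure still reaches it

del axis
gc.collect()
freed_after_second_del = alive() is None
print(freed_after_second_del)         # True: nothing references it anymore
```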
With %matplotlib widget, in contrast, the figure is known to plt.fignum_exists(num), but closing it still has no effect. The rest is about the same.
Question: Have I overlooked something? Is it really that complicated? At this point, the "simplest" solution seems to be to write some sort of plot manager object that tracks all of this state. When a figure with the same identifier is replotted, the manager could release the old objects so that they get garbage collected. So basically, I would do manually what matplotlib used to do.
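To make the idea concrete, here is a minimal sketch of such a plot manager. All names in it (PlotManager, replot, release, matplotlib_manager) are made up for illustration. It assumes that the manager is the only holder of the plot objects; if the notebook cell keeps its own figure, axis or lines variables, those would still keep the objects alive. I have not verified that it actually frees the memory under the inline or widget backend.

```python
import gc


class PlotManager:
    """Hypothetical helper: owns one group of plot objects per identifier."""

    def __init__(self, create, close):
        self._create = create  # callable(num) -> tuple, figure first
        self._close = close    # callable(figure) -> None
        self._slots = {}       # num -> tuple of owned objects

    def replot(self, num):
        """Release whatever was stored under num, then create it anew."""
        self.release(num)
        objs = self._create(num)
        self._slots[num] = objs
        return objs

    def release(self, num):
        """Close the stored figure, drop our references, run the collector."""
        objs = self._slots.pop(num, None)
        if objs is None:
            return
        self._close(objs[0])
        del objs               # drop the last reference held by the manager
        gc.collect()


def matplotlib_manager():
    """Wire the manager to pyplot (imported lazily, only when called)."""
    import matplotlib.pyplot as plt

    def create(num):
        return plt.subplots(num=num)  # (figure, axes)

    def close(figure):
        figure.clf()
        plt.close(figure)

    return PlotManager(create, close)
```

In a notebook cell this might be used as manager = matplotlib_manager() once, and then manager.replot(123) at the top of the plotting cell, taking care to del any local figure, axis and lines names at the end of the cell so that the manager really is the last owner.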