Can utilising Numpy functions inside the TensorFlow pipeline lead to memory leaks?


While training models in a loop using TensorFlow 2.14, the GPU RAM in use gradually increases until it reaches a maximum value at some iteration, at which point the process is killed.

So far, I've tried many of the solutions I've found (such as resetting the graph, training inside separate processes with multiprocessing, or closing the CUDA context with Numba). In the end, I simply ran my Python script in a loop from bash.
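For context, the overall structure of my training code looks roughly like this (a minimal sketch; build_model, evaluate, the dataset names and the loop constants are placeholders, not my actual code):

```python
import tensorflow as tf

# Minimal sketch of the pattern; build_model, evaluate, train_ds, val_ds,
# NUM_RUNS and EPOCHS are placeholders for my actual code.
for run_idx in range(NUM_RUNS):
    model = build_model()                 # a fresh model every iteration
    model.fit(train_ds, epochs=EPOCHS)
    results = evaluate(model, val_ds)     # post-processing uses NumPy helpers

    # Attempted cleanup between runs; GPU memory usage still keeps growing.
    tf.keras.backend.clear_session()
    del model
```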

While this is a workaround of sorts, the issue still leaves me confused. I have come across mentions that TensorFlow should not be used together with Numpy:

At the second link, I found the following statement:

Converting between TensorFlow tensors and Numpy arrays can be expensive and can lead to memory leaks if not managed properly.

This would be bad news for me: my project has quite a lot of functions that use Numpy, including tf.Tensor -> np.ndarray conversions and vice versa.
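The conversions themselves are of the ordinary kind, roughly like this (a simplified sketch rather than the exact functions from my project):

```python
import numpy as np
import tensorflow as tf

def tensor_to_array(t: tf.Tensor) -> np.ndarray:
    # Eager tensor -> NumPy array (copies the data to host memory)
    return t.numpy()

def array_to_tensor(a: np.ndarray) -> tf.Tensor:
    # NumPy array -> TensorFlow tensor
    return tf.convert_to_tensor(a)
```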

So, my question is: what is the underlying mechanism behind this problem, and how (if possible) can I use Numpy inside TensorFlow code without memory leaks (and without giving up graph execution)?

Edit: Here are some Numpy functions I use between epochs:

  • np.zeros()
  • np.array()
  • flatten()
  • np.unique()
  • np.union1d()
  • np.sum()
  • np.logical_and()
  • np.logical_not()
  • np.mean()

Moreover, I call the .numpy() method on tensors in some places; a rough sketch of how these pieces fit together is shown below.
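As an illustration of how these functions are combined between epochs (a hypothetical per-epoch computation, not my actual code):

```python
import numpy as np

def per_epoch_stats(pred, target):
    # pred and target arrive as NumPy arrays, obtained via .numpy() on eager tensors
    pred = np.array(pred).flatten()
    target = np.array(target).flatten()

    classes = np.union1d(np.unique(pred), np.unique(target))
    valid = np.logical_not(target < 0)            # e.g. ignore padding labels
    correct = np.logical_and(pred == target, valid)

    return classes, np.sum(correct), np.mean(correct)
```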
