Can TensorFlow consume exactly the GPU memory it requires?


I use the TensorFlow C++ API to do CNN inference. I already call set_allow_growth(true), but it still consumes more GPU memory than it actually needs.

set_per_process_gpu_memory_fraction can only set an upper bound on the GPU memory, but different CNN models have different upper bounds. Is there a good way to solve this problem?

1 Answer

Unfortunately, there's no such flag available out of the box, but this can be done manually:

By default, TF allocates all of the available GPU memory. Setting set_allow_growth to true causes TF to allocate the memory it needs in chunks instead of grabbing all GPU memory at once. Every time TF requires more GPU memory than it has already allocated, it allocates another chunk.
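For reference, a minimal sketch of how this is set up with the TensorFlow C++ API (the includes assume the standard TensorFlow source layout; graph loading and inference are omitted):

```cpp
#include <memory>

#include "tensorflow/core/public/session.h"
#include "tensorflow/core/public/session_options.h"

int main() {
  tensorflow::SessionOptions options;
  // Start with a small allocation and grow in chunks on demand,
  // instead of claiming all GPU memory up front.
  options.config.mutable_gpu_options()->set_allow_growth(true);

  std::unique_ptr<tensorflow::Session> session(tensorflow::NewSession(options));
  // ... load the graph and run inference as usual ...
  return 0;
}
```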

In addition, as you mentioned, TF supports set_per_process_gpu_memory_fraction, which specifies the maximum GPU memory the process may use, as a fraction of the total GPU memory. If TF requires more GPU memory than this allows, it raises an out-of-memory (OOM) error.

Unfortunately, I think the chunk size cannot be set by the user and is hard-coded in TF (I believe the chunk size is 4GB, but I'm not sure).

This means you can specify the maximum amount of GPU memory you allow TF to use, as a fraction of the total. If you know how much GPU memory you have in total (it can be retrieved with nvidia-smi) and how much of it you want to allow, you can compute the fraction and pass it to TF.
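For example, a rough sketch assuming an 11 GB card (as reported by nvidia-smi) and a 2 GB budget for this process (both numbers are placeholders, not recommendations):

```cpp
tensorflow::SessionOptions options;

const double total_gpu_gb = 11.0;  // total memory, taken from nvidia-smi
const double allowed_gb = 2.0;     // memory you want to allow this model
const double fraction = allowed_gb / total_gpu_gb;  // ~0.18

// Cap this process at roughly 2 GB of the 11 GB card.
options.config.mutable_gpu_options()->set_per_process_gpu_memory_fraction(fraction);
```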

If you run a small number of neural networks, you can find the required GPU memory for each of them by running it with different allowed amounts of GPU memory, binary-search style, and seeing what the minimum fraction is that still lets the network run (a sketch of that search follows below). Then, setting the value you found as set_per_process_gpu_memory_fraction for each network will achieve what you wanted.
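A sketch of that search, assuming a hypothetical callback runs_with_fraction(f) that you implement: it should build a session with set_per_process_gpu_memory_fraction(f), run one inference, and return false if it fails with an OOM error:

```cpp
#include <functional>

// Returns (approximately) the smallest memory fraction that still lets the
// network run. Invariant: the network fails at `lo` and succeeds at `hi`.
double FindMinimumFraction(const std::function<bool(double)>& runs_with_fraction,
                           double tolerance = 0.01) {
  double lo = 0.0;  // too little memory
  double hi = 1.0;  // the whole GPU, assumed to be enough
  while (hi - lo > tolerance) {
    const double mid = (lo + hi) / 2.0;
    if (runs_with_fraction(mid)) {
      hi = mid;  // mid was enough; try a smaller fraction
    } else {
      lo = mid;  // mid was too little; allow more
    }
  }
  return hi;
}
```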