I'm trying to convert a pre-trained model to ONNX format using tf2onnx.convert. The command I ran:
$ python3 -m tf2onnx.convert --saved-model models --output tf_model_op9.onnx
On executing the command, the process runs out of memory and is killed like this:
2021-06-10 20:45:45.363569: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 984 MB memory) -> physical GPU (device: 0, name: Xavier, pci bus id: 0000:00:00.0, compute capability: 7.2)
2021-06-10 20:45:46,335 - INFO - Computed 2 values for constant folding
Killed
On checking /var/log/kern.log I see:
Jun 10 21:01:36 dreamvu-desktop kernel: [559821.101983] Out of memory: Kill process 27888 (python3) score 501 or sacrifice child
Jun 10 21:01:36 dreamvu-desktop kernel: [559821.102503] Killed process 27888 (python3) total-vm:18059264kB, anon-rss:3788464kB, file-rss:126752kB, shmem-rss:0kB
Jun 10 21:01:36 dreamvu-desktop kernel: [559822.232634] oom_reaper: reaped process 27888 (python3), now anon-rss:0kB, file-rss:127808kB, shmem-rss:0kB
Most of the solutions I have found are to limit the batch size (already 1), cap GPU resources using sessions (already tried), change the number of CPU threads, or set a memory limit (not supported even in TF v2.5). I think I need to limit the RAM being used.
How do I do that?
OS : ubuntu 18.04 || Memory : 7.6 GiB
Graphics : NVIDIA Tegra Xavier (nvgpu)/integrated
Processor : ARMv8 Processor rev 0 (v8l) × 6
Have you considered using a swapfile to provide the extra memory you need (assuming you have the disk space to do so)?
As root, or with sudo, you would need to:
use the free command to confirm you have the extra memory available as swap
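A minimal sketch of those steps (the 8G size and the /swapfile path are assumptions; pick a size that fits your free disk space):

```shell
# Check current memory and swap before changing anything
free -h

# Create and enable an 8 GB swapfile (size and path are assumptions)
sudo fallocate -l 8G /swapfile   # on filesystems without fallocate support, use dd instead
sudo chmod 600 /swapfile         # restrict access; mkswap warns otherwise
sudo mkswap /swapfile
sudo swapon /swapfile

# Confirm the new swap space shows up in the Swap row
free -h
```

To keep the swapfile across reboots, add a line like `/swapfile none swap sw 0 0` to /etc/fstab. Note that swap on the Xavier's eMMC/SD storage is much slower than RAM, so the conversion may take noticeably longer, but it should avoid the OOM kill.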