How can I effectively increase the mini-batch size using the StandardUpdater class in Chainer?
In PyTorch, I can effectively increase the mini-batch size by accumulating gradients:
- Execute loss.backward() on every iteration.
- Execute optimizer.step() / optimizer.zero_grad() only once every three iterations. This effectively triples the mini-batch size, as sketched below.
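For example (a minimal sketch; `net`, `optimizer`, and `data_loader` are placeholders for an existing model, optimizer, and loader):

```python
import torch.nn as nn

accum_steps = 3  # apply the optimizer once every three mini-batches
criterion = nn.CrossEntropyLoss()

optimizer.zero_grad()
for i, (x, t) in enumerate(data_loader):
    loss = criterion(net(x), t)
    # Scale so the accumulated gradient is an average over the
    # virtual (three-times-larger) batch rather than a sum.
    (loss / accum_steps).backward()
    if (i + 1) % accum_steps == 0:
        optimizer.step()       # apply the accumulated gradients
        optimizer.zero_grad()  # reset for the next virtual batch
```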
Question 1. Is the same thing possible in Chainer?
- Execute loss.backward() on every iteration.
- Execute optimizer.update() followed by net.cleargrads() only once every three iterations. Would this effectively increase the mini-batch size? (See the sketch below.)
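Something like this manual loop (a sketch; `net`, `optimizer`, and `train_iter` are placeholders, and softmax_cross_entropy stands in for whatever loss is actually used):

```python
import chainer.functions as F
from chainer.dataset import concat_examples

accum_steps = 3

net.cleargrads()
for i, batch in enumerate(train_iter):
    x, t = concat_examples(batch)
    loss = F.softmax_cross_entropy(net(x), t)
    # backward() adds to the existing .grad arrays, so gradients
    # accumulate across iterations until cleargrads() is called.
    (loss / accum_steps).backward()
    if (i + 1) % accum_steps == 0:
        optimizer.update()  # no arguments: apply the stored gradients
        net.cleargrads()    # reset for the next virtual batch
```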
Question 2. In fact, I'm using the StandardUpdater class. Is it possible to achieve this via any of its hyperparameters? Or should I write my own class that inherits from StandardUpdater and change the implementation as described above?
I'm sorry if these questions have already been asked. Any advice is appreciated.
(The question seems quite old, but I stumbled upon it and wanted to share my solution.)
You would basically do it the same way as in PyTorch. Unfortunately, the StandardUpdater has neither a hyperparameter that supports this nor an implementation for such accumulated updates. But here is my implementation (basically as you suggested in your question: inherit from StandardUpdater and re-implement the update_core method):
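A sketch along those lines follows; the class name `GradientAccumulationUpdater` and the `n_accum` parameter are illustrative choices, and the body mirrors the old update_core:

```python
from chainer.training import StandardUpdater


class GradientAccumulationUpdater(StandardUpdater):
    """Applies the optimizer only every `n_accum` iterations and
    accumulates gradients in between."""

    def __init__(self, iterator, optimizer, n_accum=3, **kwargs):
        super().__init__(iterator, optimizer, **kwargs)
        self.n_accum = n_accum
        # Start from clean gradients so the first window is correct.
        self._optimizers['main'].target.cleargrads()

    def update_core(self):
        batch = self._iterators['main'].next()
        in_arrays = self.converter(batch, self.device)

        optimizer = self._optimizers['main']
        loss_func = self.loss_func or optimizer.target

        # Compute the loss ourselves instead of passing loss_func to
        # optimizer.update(), which would clear the gradients each time.
        if isinstance(in_arrays, tuple):
            loss = loss_func(*in_arrays)
        elif isinstance(in_arrays, dict):
            loss = loss_func(**in_arrays)
        else:
            loss = loss_func(in_arrays)

        # Scale so the accumulated gradient is an average over the
        # virtual (n_accum-times-larger) batch rather than a sum.
        loss = loss / self.n_accum
        loss.backward()  # adds to the existing .grad arrays

        # self.iteration counts the already-finished iterations.
        if (self.iteration + 1) % self.n_accum == 0:
            optimizer.update()  # no arguments: apply stored gradients
            optimizer.target.cleargrads()
```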
The implementation is quite old (I think for Chainer 4 or 5), but it works for me with Chainer 7.8 as well. One could update some lines to match the newer implementation of the update_core method, but as I said, it works for me. Hopefully it helps ;)
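For reference, a hypothetical usage with a Trainer (`train_iter` and `optimizer` are placeholders):

```python
from chainer import training

updater = GradientAccumulationUpdater(train_iter, optimizer, n_accum=3)
trainer = training.Trainer(updater, (20, 'epoch'), out='result')
trainer.run()
```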