GluonCV object detection fine-tuning - Select which layers are modified (freeze the rest)


I have a question about the procedure for fine-tuning a pre-trained object detection model with GluonCV, described in this tutorial.

As far as I understand, the described procedure modifies all the weight values in the model. I wanted to only fine-tune the fully connected layer at the end of the network, and freeze the rest of the weights.

I assume that I should specify which parameters I want to modify when creating the Trainer:

trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.001, 'wd': 0.0005, 'momentum': 0.9})

so, instead of net.collect_params(), I should list only the parameters I'm interested in training, and run the rest of the process normally. However, I don't know how to isolate these parameters precisely. I tried printing:

params = net.collect_params()

but, out of this list, I don’t know which ones correspond to the final FC layers. Any suggestions?

1 Answer

Let's say we have a pretrained Gluon model for a classification task:

>>> import mxnet as mx
>>> net = mx.gluon.nn.HybridSequential()
>>> net.add(mx.gluon.nn.Conv2D(channels=6, kernel_size=5, padding=2, activation='sigmoid'))
>>> net.add(mx.gluon.nn.MaxPool2D(pool_size=2, strides=2))
>>> net.add(mx.gluon.nn.Flatten())
>>> net.add(mx.gluon.nn.Dense(units=10))
>>> net.collect_params()
hybridsequential0_ (
  Parameter conv0_weight (shape=(6, 0, 5, 5), dtype=<class 'numpy.float32'>)
  Parameter conv0_bias (shape=(6,), dtype=<class 'numpy.float32'>)
  Parameter dense0_weight (shape=(10, 0), dtype=<class 'numpy.float32'>)
  Parameter dense0_bias (shape=(10,), dtype=<class 'numpy.float32'>)
)

To fine-tune this convolutional network, we want to freeze all the blocks except Dense.

First, recall that the collect_params method accepts a regex string to select specific block parameters by name (or by prefix; see the prefix argument of Conv2D, Dense, and every other Gluon (hybrid) block). By default, a prefix is derived from the class name: for a Conv2D block it is conv0_, conv1_, etc. Moreover, collect_params returns an instance of mxnet.gluon.parameter.ParameterDict, which has a setattr method.
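For example, with the network above, selecting only the convolutional parameters by a positive match looks like this (the names come from the collect_params output shown earlier; the exact repr may vary slightly across MXNet versions):

>>> net.collect_params('conv.*')
hybridsequential0_ (
  Parameter conv0_weight (shape=(6, 0, 5, 5), dtype=<class 'numpy.float32'>)
  Parameter conv0_bias (shape=(6,), dtype=<class 'numpy.float32'>)
)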

Solution:

>>> conv_params = net.collect_params('(?!dense).*')
>>> conv_params.setattr('grad_req', 'null')

or simply

>>> net.collect_params('(?!dense).*').setattr('grad_req', 'null')

Here we exclude all the parameters whose names match dense, which leaves only the conv parameters, and set their grad_req attribute to 'null'. Now, training the model net with mxnet.gluon.Trainer will update only the dense parameters.
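To see this end to end, here is a minimal training step with a dummy batch (a sketch; the input shape, labels, and loss are placeholders):

>>> net.initialize()
>>> trainer = mx.gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.001})
>>> loss_fn = mx.gluon.loss.SoftmaxCrossEntropyLoss()
>>> x = mx.nd.random.uniform(shape=(4, 1, 28, 28))  # dummy image batch
>>> y = mx.nd.array([0, 1, 2, 3])                   # dummy labels
>>> with mx.autograd.record():
...     loss = loss_fn(net(x), y)
>>> loss.backward()
>>> trainer.step(batch_size=4)  # the frozen conv0_* parameters are skipped

Note that Trainer skips parameters whose grad_req is 'null', so it is safe to pass net.collect_params() unchanged.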


It is more convenient to have a pretrained model with separate attributes indicating specific blocks, e.g. a features block, anchor generators, etc. In our case, we have a convolutional network that extracts features and passes them to an output block.

class ConvNet(mx.gluon.HybridBlock):
    def __init__(self, n_classes, params=None, prefix=None):
        super().__init__(params=params, prefix=prefix)

        self.features = mx.gluon.nn.HybridSequential()
        self.features.add(mx.gluon.nn.Conv2D(channels=6, kernel_size=5, padding=2,
                                             activation='sigmoid'))
        self.features.add(mx.gluon.nn.MaxPool2D(pool_size=2, strides=2))
        self.features.add(mx.gluon.nn.Flatten())

        self.output = mx.gluon.nn.Dense(units=n_classes)

    def hybrid_forward(self, F, x):
        x = self.features(x)   # backbone: conv -> pool -> flatten
        return self.output(x)  # classification head

With this ConvNet declaration, we don't have to use regexps to access the required blocks:

>>> net = ConvNet(n_classes=10)
>>> net.features.collect_params().setattr('grad_req', 'null')
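This layout also makes the approach from the question straightforward: instead of (or in addition to) freezing, pass only the output block's parameters to the Trainer (a minimal sketch reusing the question's hyperparameters):

>>> net = ConvNet(n_classes=10)
>>> net.initialize()
>>> trainer = mx.gluon.Trainer(net.output.collect_params(), 'sgd',
...                            {'learning_rate': 0.001, 'wd': 0.0005, 'momentum': 0.9})

Setting grad_req to 'null' is still worth doing even then: it also skips computing the unused gradients during the backward pass.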

GluonCV models follow exactly this pattern. See the documentation of the desired model and choose the blocks you would like to freeze. If the docs don't describe the blocks, run collect_params to list all the parameters, exclude with a regexp the ones you want to fine-tune (as in the (?!dense).* example above), and set the returned parameters' grad_req to 'null'.
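For instance, with an SSD detector like the one in the fine-tuning tutorial, the recipe could look like the sketch below. It assumes the model exposes a features attribute for the backbone (GluonCV's SSD models do); the names of the prediction-head parameters vary by model, so inspect collect_params() before writing a regexp:

>>> from gluoncv import model_zoo
>>> net = model_zoo.get_model('ssd_512_mobilenet1.0_voc', pretrained=True)
>>> # freeze the feature extractor; the class/box prediction heads stay trainable
>>> net.features.collect_params().setattr('grad_req', 'null')
>>> # double-check what remains trainable:
>>> [name for name, p in net.collect_params().items() if p.grad_req != 'null']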