I'm using the ResNet18 provided by PyTorch for my homework. The input images are 3x64x64, divided into 100 classes, and the teacher asked us to modify the strides and paddings so that the last feature map is 8x8x512. I changed the stride of the max-pooling layer and of the first conv layer of the 3rd conv block, and replaced the fully connected layer, like this:

import torch.nn as nn
from torchvision import models

model = models.resnet18(pretrained=True)
model.maxpool = nn.MaxPool2d(kernel_size=3, stride=1, padding=1, dilation=1, ceil_mode=False)
model.layer3[0].conv1 = nn.Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
model.fc = nn.Linear(model.fc.in_features, 100)

The expected feature map sizes are like this: https://i.stack.imgur.com/2AfhS.png
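As a sanity check of those expected sizes, here is a quick sketch of my own that walks the spatial size through the modified network using the standard conv/pool output-size formula, floor((H + 2p - k) / s) + 1:

```python
# Output side length of a square conv/pool layer:
# floor((size + 2*padding - kernel) / stride) + 1
def conv_out(size, kernel, stride, padding):
    return (size + 2 * padding - kernel) // stride + 1

size = 64
size = conv_out(size, 7, 2, 3)  # conv1, stride 2       -> 32
size = conv_out(size, 3, 1, 1)  # maxpool, now stride 1 -> 32
# layer1 keeps stride 1                                 -> 32
size = conv_out(size, 3, 2, 1)  # layer2, stride 2      -> 16
size = conv_out(size, 3, 1, 1)  # layer3, now stride 1  -> 16
size = conv_out(size, 3, 2, 1)  # layer4, stride 2      -> 8
print(size)  # 8, i.e. an 8x8x512 final feature map
```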

When I use torchsummary.summary to check whether the feature map sizes are right, I get this error:

RuntimeError: The size of tensor a (16) must match the size of tensor b (8) at non-singleton dimension 3

What could be the reason? I think the strides, paddings, and channel counts are all set correctly.

There is 1 best solution below.


I found the reason myself. The problem is the `out = out + identity` part of the residual block. I changed the stride of the conv layer in conv block 3, but I didn't change the stride of the conv layer in that block's downsampling (shortcut) branch, so the feature maps from the downsample path and from the conv path end up with different spatial sizes. It's a very subtle mistake, but it still cost me a lot of time. I just hope no one else gets stuck here as long as I did.