RetinaNet feature maps dimensional issue


I've been reading a lot about object detection, and specifically about RetinaNet, but one part of the implementation is not clear to me.

It's said that the feature maps from all pyramid levels are passed to weight-shared sub-networks for classification and bounding-box regression.

But how is this possible when the weights of the sub-networks are shared across all pyramid levels? The outputs would have different dimensions, because, from my understanding, the last layer of each sub-network is fully connected to the output, if I'm not mistaken. The original paper doesn't clarify this. Is there some zero-padding happening here?

In Faster R-CNN architectures, an RoI pooling layer is applied to address this dimensional issue, but in this case I'm lost.

1 Answer

All the sub-networks are fully convolutional (with standard zero-padding), so they don't care about the spatial dimensions (height and width) of the feature maps. There is no fully connected layer at the end; the final layer is a convolution that produces a prediction at every spatial location.

The channel dimension is kept the same (256) at every level of the FPN structure, which is what makes weight sharing possible. The FPN itself, which produces those maps, is not weight-shared.
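To make this concrete, here is a minimal PyTorch sketch of a RetinaNet-style classification head (not the reference implementation). The `ClassificationHead` class and its exact layer counts follow the paper's description (four 3x3 conv layers with 256 filters, 9 anchors, 80 classes for COCO), but treat it as an illustrative assumption rather than the official code:

```python
import torch
import torch.nn as nn

# Hypothetical minimal sketch of a RetinaNet-style classification head.
# Assumes every FPN level outputs 256 channels; the head is fully
# convolutional, so the same weights apply to any spatial size.
class ClassificationHead(nn.Module):
    def __init__(self, in_channels=256, num_anchors=9, num_classes=80):
        super().__init__()
        layers = []
        for _ in range(4):  # four 3x3 conv layers, as in the paper
            layers += [nn.Conv2d(in_channels, in_channels, 3, padding=1),
                       nn.ReLU()]
        self.tower = nn.Sequential(*layers)
        # Final conv predicts num_anchors * num_classes per spatial location
        self.cls_logits = nn.Conv2d(in_channels, num_anchors * num_classes,
                                    3, padding=1)

    def forward(self, x):
        return self.cls_logits(self.tower(x))

head = ClassificationHead()  # ONE head, shared across all pyramid levels
# FPN levels P3..P7 for a 512x512 input: same channels, different H/W
fpn_maps = [torch.randn(1, 256, s, s) for s in (64, 32, 16, 8, 4)]
for p in fpn_maps:
    out = head(p)  # works at every level; output keeps that level's H/W
    print(tuple(out.shape))  # (1, 9*80, H, W): per-location predictions
```

Because every layer is a convolution, the same weights slide over a 64x64 map from P3 just as well as over a 4x4 map from P7. Only the output's spatial size differs, and the loss is computed per anchor, so no fixed output dimension (and no RoI pooling) is needed.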