I want to ask: is it true that the receptive field of the Swin Transformer is limited to the local window in which self-attention is computed? And is there any way to increase the receptive field when using the Swin Transformer?
I know that when we stack consecutive convolution layers, the receptive field grows as a result. Can we achieve the same effect with the Swin Transformer?
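To make the convolution analogy concrete, here is a minimal sketch (my own helper function, not from any library) that computes the receptive field of a stack of convolution layers from their kernel sizes and strides:

```python
def receptive_field(kernel_sizes, strides=None):
    """Receptive field of stacked conv layers (stride 1 by default)."""
    if strides is None:
        strides = [1] * len(kernel_sizes)
    rf, jump = 1, 1  # jump = spacing between adjacent outputs in input pixels
    for k, s in zip(kernel_sizes, strides):
        rf += (k - 1) * jump
        jump *= s
    return rf

print(receptive_field([3, 3]))     # two 3x3 convs -> 5
print(receptive_field([3, 3, 3]))  # three 3x3 convs -> 7
```

So each extra 3x3 layer widens the receptive field by 2 pixels; my question is whether stacking (shifted-)window attention blocks gives a comparable growth.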