Given a tensor of size [8, 64, 128, 128] (B, CH, H, W), I would like to apply a channel-wise 2D max pooling operation over a 2x2x64 region (H, W, CH) with a stride of 1, so as to obtain another tensor of size [8, 1, 128, 128]. Is the code below going in the right direction?
import torch
import torch.nn as nn
torch.manual_seed(0)
B, CH, H, W = 8, 64, 128, 128
x_batch = torch.randn((B, CH, H, W))
max3d = nn.MaxPool3d((64,2,2), stride=1)
x_max = max3d(x_batch)
print(x_max.shape)  # torch.Size([8, 1, 127, 127])
In addition, the code above produces an output of size [8, 1, 127, 127], but I need exactly [8, 1, 128, 128]. I have not been able to find the right padding yet: for example, with padding=(0,1,1) I obtain an output of size [8, 1, 129, 129].
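Because your kernel size is even (2), keeping the spatial size at 128 requires asymmetric padding: one extra row and one extra column, on one side only. The MaxPoolXd layers only accept a symmetric padding argument, which is why padding=(0,1,1) adds a row and column on both sides and gives 129. Therefore, you need to pad the input first with nn.ZeroPad2d, which accepts a separate padding amount for each side, and then apply the pooling.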
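A minimal sketch of that approach, reusing the variable names from the question. The choice to pad on the right and bottom (rather than the left and top) and the names pad and x_padded are assumptions made for illustration; either side gives the desired shape.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
B, CH, H, W = 8, 64, 128, 128
x_batch = torch.randn((B, CH, H, W))

# ZeroPad2d takes (left, right, top, bottom); here one zero column is added
# on the right and one zero row at the bottom (assumed side, see lead-in).
pad = nn.ZeroPad2d((0, 1, 0, 1))
x_padded = pad(x_batch)  # [8, 64, 129, 129]

# The 4D input is read as (C, D, H, W), so the pooling window of depth CH
# spans all 64 channels and 2x2 spatial positions, as in the question.
max3d = nn.MaxPool3d((CH, 2, 2), stride=1)
x_max = max3d(x_padded)

print(x_max.shape)
```

One caveat: the padded zeros take part in the max, so a border window whose real values are all negative would return 0. If that matters, padding with torch.nn.functional.pad(x_batch, (0, 1, 0, 1), value=float("-inf")) instead should avoid it, since negative infinity can never win the max.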