I am trying to use PyTorch's nn.Conv3d for a convolutional autoencoder on a system with AMD GPUs. We have the latest ROCm (4.5) and MIOpen (2.14). The same training script works on NVIDIA GPUs. I managed to get the same training running with nn.Conv2d, but with nn.Conv3d I get this error:
    return forward_call(*input, **kwargs)
  File ".../lib/python3.9/site-packages/torch/nn/modules/conv.py", line 587, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File ".../lib/python3.9/site-packages/torch/nn/modules/conv.py", line 582, in _conv_forward
    return F.conv3d(
RuntimeError: miopenStatusUnknownError
MIOpen Error: /MIOpen/src/ocl/convolutionocl.cpp:831: Forward Convolution cannot be executed due to incorrect params
Here is the network:
import torch.nn as nn

class autoencoder(nn.Module):
    def __init__(self):
        super(autoencoder, self).__init__()
        # Single 3D convolution; kernel 3, stride 1, padding 1 keeps the spatial size
        self.conv_en = nn.Conv3d(in_channels=3, out_channels=32, kernel_size=3, stride=1, padding=1)

    def forward(self, inp_x):
        x = self.conv_en(inp_x)
        return x
Here is the training loop:
for inputs, labels in train_loader:
    # Swap dims 1 and 2 (nn.Conv3d expects N, C, D, H, W), then move the batch to the GPU
    inputs = inputs.permute(0,2,1,3,4).to(torch.device('cuda'))
    predictions = distrib_model(inputs)
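To help rule out a plain shape mistake, here is a minimal pure-Python sketch of what the permute and the convolution above should do to the tensor shape. The batch layout `(4, 16, 3, 64, 64)` and the helper names `permute_shape` and `conv3d_out_shape` are assumptions made up for illustration; the output-size formula is the one given in the nn.Conv3d documentation, applied with the same kernel size, stride and padding on all three spatial axes.

```python
def permute_shape(shape, order):
    """Return the shape that tensor.permute(*order) would produce."""
    return tuple(shape[i] for i in order)


def conv3d_out_shape(shape, out_channels, kernel_size, stride=1, padding=0, dilation=1):
    """Output shape of a Conv3d layer for an (N, C, D, H, W) input.

    Assumes the same kernel_size, stride, padding and dilation on D, H and W.
    """
    n, _in_channels, *spatial = shape
    out_spatial = [
        (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1
        for size in spatial
    ]
    return (n, out_channels, *out_spatial)


# Hypothetical batch from the loader: (batch, depth/frames, channels, height, width)
loader_shape = (4, 16, 3, 64, 64)

# Same permutation as in the training loop: channels move to dim 1
model_input = permute_shape(loader_shape, (0, 2, 1, 3, 4))

# Same hyperparameters as self.conv_en in the network
model_output = conv3d_out_shape(model_input, out_channels=32, kernel_size=3, stride=1, padding=1)

print(model_input)
print(model_output)
```

If the shapes come out as expected here (3 channels in dim 1 going in, 32 coming out, spatial size unchanged), the parameters passed to the layer are consistent and the error is more likely coming from the backend than from the script.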
Any ideas?
FYI, this issue seems to be fixed in ROCm 5.1.1. The changelog can be found here.