I am trying to use PyTorch's nn.Conv3d for convolutional autoencoders on a system with AMD GPUs. We have the latest ROCm (4.5) and MIOpen (2.14). The same training script works on NVIDIA GPUs. I managed to get the same training working with nn.Conv2d, but with nn.Conv3d I get this error:

return forward_call(*input, **kwargs)
  File ".../lib/python3.9/site-packages/torch/nn/modules/conv.py", line 587, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File ".../lib/python3.9/site-packages/torch/nn/modules/conv.py", line 582, in _conv_forward
    return F.conv3d(
RuntimeError: miopenStatusUnknownError
MIOpen Error: /MIOpen/src/ocl/convolutionocl.cpp:831: Forward Convolution cannot be executed due to incorrect params

Here is the network:

import torch.nn as nn

class autoencoder(nn.Module):
    def __init__(self):
        super(autoencoder, self).__init__()
        # Single 3D convolution; expects input of shape (N, 3, D, H, W)
        self.conv_en = nn.Conv3d(in_channels=3, out_channels=32, kernel_size=3, stride=1, padding=1)

    def forward(self, inp_x):
        x = self.conv_en(inp_x)
        return x

Here is the training loop:

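for inputs, labels in train_loader:
    # Swap dims 1 and 2 so channels come second, as nn.Conv3d expects (N, C, D, H, W).
    # ROCm builds of PyTorch also use the 'cuda' device name.
    inputs = inputs.permute(0,2,1,3,4).to(torch.device('cuda'))
    predictions = distrib_model(inputs)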
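
Since the error mentions "incorrect params", a quick way to rule out a layout problem is to check the shapes on paper. The sketch below needs neither PyTorch nor a GPU: it applies the same permutation to a shape tuple and then the output-size formula from the nn.Conv3d documentation. The batch shape (4, 16, 3, 64, 64) and the helper names permute_shape and conv_out_size are made up for illustration; they are not taken from the real dataset or script.

```python
# Shape sanity check only; no GPU or PyTorch needed.
# The sample batch shape below is hypothetical, not the real dataset's shape.

def permute_shape(shape, order):
    """Return the shape that tensor.permute(*order) would produce."""
    return tuple(shape[i] for i in order)


def conv_out_size(size, kernel_size=3, stride=1, padding=1, dilation=1):
    """Output length along one dimension, per the nn.Conv3d documentation formula."""
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# Assumed loader layout: (N, D, C, H, W)
loader_shape = (4, 16, 3, 64, 64)

# Same permutation as in the training loop, giving (N, C, D, H, W)
conv_input = permute_shape(loader_shape, (0, 2, 1, 3, 4))

# out_channels=32; kernel_size=3, stride=1, padding=1 keep D, H, W unchanged
conv_output = (conv_input[0], 32) + tuple(conv_out_size(s) for s in conv_input[2:])

print(conv_input)
print(conv_output)
```

If dimension 1 of conv_input is 3 (matching in_channels=3), the layout is what nn.Conv3d expects, and the failure is more likely on the backend side than in the parameters passed by the script.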

Any ideas?

There is 1 best solution below

FYI, this issue seems to be fixed in ROCm 5.1.1. See the ROCm 5.1.1 changelog for details.
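
To see which side of that release a given PyTorch install is on, something like the following may help. It is only a sketch: hip_at_least is a hypothetical helper, and it assumes that the major.minor part of torch.version.hip tracks the ROCm release the wheel was built against. That string cannot tell 5.1.0 from 5.1.1, and it describes the PyTorch build rather than the ROCm runtime installed on the machine, so treat the result as a hint.

```python
import re


def hip_at_least(hip_version, minimum=(5, 1)):
    """Return True if a HIP version string is at least `minimum` (major, minor)."""
    if not hip_version:
        # torch.version.hip is None on builds without ROCm support
        return False
    match = re.match(r"(\d+)\.(\d+)", hip_version)
    if match is None:
        return False
    return (int(match.group(1)), int(match.group(2))) >= minimum


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        hip = getattr(torch.version, "hip", None)
        print("HIP version of this PyTorch build:", hip)
        print("At least 5.1:", hip_at_least(hip))
```

If this reports a version below 5.1, upgrading ROCm and installing a PyTorch build that matches it would be the first thing to try before changing the model.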