I have fine-tuned a DistilBERT model for text classification. To reduce inference time, I converted the PyTorch model to ONNX and quantized it. However, I would now like to run inference from PyTorch, either by converting the ONNX model back to a PyTorch model or by loading the ONNX model directly with PyTorch. I have tried libraries such as onnx2torch and onnx2pytorch, but neither worked for me. My current attempt, converting the ONNX model with onnx2torch, fails with the error shown below.
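import onnx
import onnx2torch

# Load the optimized, quantized ONNX model from disk
onnx_model_path = './onnx/bert_opt_quant.onnx'
onnx_model = onnx.load(onnx_model_path)

# Attempt the ONNX -> PyTorch conversion (this call raises the error below)
model = onnx2torch.convert(onnx_model)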
Error:
NotImplementedError: Converter is not implemented (OperationDescription(domain='com.microsoft', operation_type='QEmbedLayerNormalization', version=1))
I want to convert the ONNX model to PyTorch, or find any other way to load the ONNX model with PyTorch for inference.
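For context, the operator named in the error belongs to the `com.microsoft` domain, which is ONNX Runtime's contrib operator set rather than the standard ONNX operator set. The optimization and quantization steps presumably introduced it. Below is a minimal diagnostic sketch, not a fix, that counts every node outside the standard domains so it is clear how many operators a converter would have to handle. The helper names `count_nonstandard_ops` and `report_nonstandard_ops` are hypothetical and only used for illustration; the sketch assumes nothing beyond each graph node exposing `domain` and `op_type`, as ONNX `NodeProto` objects do.

```python
from collections import Counter

# Domains that make up the standard ONNX operator set.
# An empty string and "ai.onnx" both refer to the default domain.
STANDARD_DOMAINS = {"", "ai.onnx"}


def count_nonstandard_ops(nodes):
    """Count (domain, op_type) pairs for nodes outside the standard ONNX domains.

    `nodes` is any iterable of objects with `domain` and `op_type`
    attributes, for example `onnx_model.graph.node`.
    """
    return Counter(
        (node.domain, node.op_type)
        for node in nodes
        if node.domain not in STANDARD_DOMAINS
    )


def report_nonstandard_ops(onnx_model_path):
    """Hypothetical helper: load a model and print its non-standard operators."""
    import onnx  # imported lazily so count_nonstandard_ops has no hard dependency

    onnx_model = onnx.load(onnx_model_path)
    for (domain, op_type), count in count_nonstandard_ops(onnx_model.graph.node).items():
        print(f"{domain}::{op_type} x{count}")
```

Calling `report_nonstandard_ops('./onnx/bert_opt_quant.onnx')` on my model should list `QEmbedLayerNormalization` along with any other `com.microsoft` operators that the conversion would also need to support.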