I was able to successfully quantise a PyTorch model for Hugging Face text classification with Intel LPOT (Neural Compressor).
I now have both the original fp32 model and the quantised int8 model on my machine. For inference I loaded the quantised LPOT model with the code below:
from transformers import AutoModelForSequenceClassification
from lpot.utils.pytorch import load

# load the original fp32 model, then restore the quantised weights on top of it
model = AutoModelForSequenceClassification.from_pretrained('fp32/model/path')
modellpot = load("path/to/lpotmodel/", model)
I can see some speed improvement, but I wanted to confirm that the model weights have actually been quantised and use data types such as int8, fp16, etc., which should ideally be the reason for the speed-up. When I iterate over the model weights and print their dtypes, all of them show up as fp32:
for param in modellpot.parameters():
    print(param.data.dtype)
Output:
torch.float32
torch.float32
torch.float32
torch.float32
torch.float32
torch.float32
torch.float32
..
...
How do I verify whether my PyTorch model has been quantised?
Use print(modellpot) to check whether the model is quantized. For example, a Linear layer will be converted to a QuantizedLinear layer. Only layers that have quantized implementations in PyTorch are converted, so not all parameters end up as int8/uint8.
When the model is printed, each quantized layer shows its data type, e.g. dtype=torch.qint8 if int8 quantization has been performed. The quantized weights are stored in packed form inside those layers rather than as regular parameters, which is why the loop over modellpot.parameters() above only shows the remaining fp32 tensors.
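If you want to check this programmatically instead of reading the printed model, a minimal sketch along these lines should work, assuming the converted layers are PyTorch's native quantized Linear modules (exact module paths can differ between PyTorch versions):

import torch.nn.quantized as nnq
import torch.nn.quantized.dynamic as nnqd

# Quantized Linear modules keep their weights packed, so they are not
# returned by modellpot.parameters(); use the weight() accessor instead.
quantized_linear_types = (nnq.Linear, nnqd.Linear)

for name, module in modellpot.named_modules():
    if isinstance(module, quantized_linear_types):
        # weight() returns the quantized weight tensor
        print(name, type(module).__name__, module.weight().dtype)  # expect torch.qint8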