I have a tflite file and I want to quantize it.
How to convert TFLite model to quantized TFLite model?
Please note that you need the source model in order to quantise it. An already converted tflite model cannot be quantised, because of the limitations of its format.
Your source model could be a TF SavedModel, a Keras model instance, or ONNX. You can find all the supported source model formats HERE, e.g. tf.lite.TFLiteConverter.from_keras_model(model), which takes the Keras model object rather than a path. For ONNX conversion, please check this tool.
There are various ways to quantise a model. The easiest is post-training quantisation of the weights only, in which the TFLite converter applies min-max quantisation to derive the quantisation parameters, i.e. the scale and zero point.
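To make the min-max idea concrete, here is a minimal pure-Python sketch of 8-bit affine quantisation. The function names and the unsigned 0..255 range are assumptions chosen for illustration. TFLite's own weight quantisation differs in detail, so treat this as the general idea rather than what the converter actually runs.

```python
def minmax_qparams(values, qmin=0, qmax=255):
    """Derive scale and zero point from the observed min and max."""
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    if hi == lo:  # all values are zero; any positive scale works
        return 1.0, qmin
    scale = (hi - lo) / (qmax - qmin)
    zero_point = int(round(qmin - lo / scale))
    zero_point = max(qmin, min(qmax, zero_point))  # keep it inside the integer range
    return scale, zero_point


def quantize(values, scale, zero_point, qmin=0, qmax=255):
    """Map floats to integers: q = round(v / scale) + zero_point, clamped."""
    return [max(qmin, min(qmax, int(round(v / scale)) + zero_point)) for v in values]


def dequantize(q_values, scale, zero_point):
    """Map integers back to approximate floats: v = scale * (q - zero_point)."""
    return [scale * (q - zero_point) for q in q_values]


weights = [-1.0, -0.25, 0.0, 0.5, 1.0]  # made-up example weights
scale, zero_point = minmax_qparams(weights)
q_weights = quantize(weights, scale, zero_point)
restored = dequantize(q_weights, scale, zero_point)
```

Each restored value differs from the original by at most one quantisation step (`scale`), which is the precision you trade for the smaller integer representation.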
I'd suggest checking the documents on the TFLite site: https://www.tensorflow.org/lite/performance/post_training_quantization
The code is:
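import tensorflow as tf

saved_model_dir = "path/to/saved_model"  # placeholder: directory of your source SavedModel
converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # applies PTQ on weights, if possible
tflite_quant_model = converter.convert()  # returns the serialized model as bytes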
A few extra things: you can also check out quantisation-aware training (link) and mixed precision training (link) in TF to get quantised weights directly, after which a simple TFLite conversion gives you the quantised model.
Depending on whether you used Keras or TF Hub, you can simply do the following (this assumes the TF model was not trained with quantization awareness):
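A minimal sketch for the Keras case, assuming a float model that is already in memory. The tiny `model` below is a hypothetical stand-in for your own network, and the output filename is an arbitrary choice.

```python
import tensorflow as tf

# Hypothetical stand-in for your trained Keras model (not from the question).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(2),
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # enable post-training quantization
tflite_quant_model = converter.convert()  # serialized model as bytes

# Write the converted model to disk.
with open("model_quant.tflite", "wb") as f:
    f.write(tflite_quant_model)
```

If the model was exported as a SavedModel instead (for example after loading it from TF Hub), tf.lite.TFLiteConverter.from_saved_model(path) is the analogous entry point.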
For more detail, refer to: https://www.tensorflow.org/lite/performance/post_training_quantization