I have tried to quantize tf.keras.applications.DenseNet121 with both TVM and TFLite. TVM's int8 quantization doesn't cause a significant loss in accuracy, but TFLite's int8 quantization gives a loss of more than 6% in accuracy. Since TVM uses symmetric quantization and TFLite uses asymmetric quantization, I expected TFLite to quantize better, but the opposite is true.
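For context, the TFLite int8 model was produced with post-training full-integer quantization along these lines (a minimal sketch; the representative dataset below is a random-data placeholder standing in for real preprocessed ImageNet images, and the sample count is arbitrary):

```python
import numpy as np
import tensorflow as tf

model = tf.keras.applications.DenseNet121(weights="imagenet")

# Calibration data generator -- placeholder only; in practice this
# should yield real, preprocessed ImageNet images.
def representative_dataset():
    for _ in range(100):
        img = np.random.rand(1, 224, 224, 3).astype(np.float32)  # stand-in for real images
        yield [tf.keras.applications.densenet.preprocess_input(img * 255.0)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Force full-integer int8 quantization (TFLite's asymmetric scheme).
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_quant_model = converter.convert()
with open("densenet121_int8.tflite", "wb") as f:
    f.write(tflite_quant_model)
```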
Do you find that the TFLite quantized model gives different results when run with TFLite versus when run with TVM?
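For example, feeding the same quantized .tflite file and the same input to both runtimes and diffing the outputs, roughly like this (a sketch assuming a recent TVM with the graph_executor API; the file name and input shape are placeholders):

```python
import numpy as np
import tensorflow as tf
import tflite
import tvm
from tvm import relay
from tvm.contrib import graph_executor

with open("densenet121_int8.tflite", "rb") as f:
    tflite_buf = f.read()

data = np.random.randint(-128, 128, size=(1, 224, 224, 3), dtype=np.int8)

# Run with the TFLite interpreter.
interp = tf.lite.Interpreter(model_content=tflite_buf)
interp.allocate_tensors()
inp = interp.get_input_details()[0]
out = interp.get_output_details()[0]
interp.set_tensor(inp["index"], data)
interp.invoke()
tflite_out = interp.get_tensor(out["index"])

# Compile and run the same model through TVM's TFLite frontend.
tflite_model = tflite.Model.GetRootAsModel(tflite_buf, 0)
mod, params = relay.frontend.from_tflite(
    tflite_model,
    shape_dict={inp["name"]: data.shape},
    dtype_dict={inp["name"]: "int8"},
)
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target="llvm", params=params)
m = graph_executor.GraphModule(lib["default"](tvm.cpu()))
m.set_input(inp["name"], data)
m.run()
tvm_out = m.get_output(0).numpy()

# Cast to int32 before subtracting to avoid int8 overflow.
print("max abs diff:", np.max(np.abs(tflite_out.astype(np.int32) - tvm_out.astype(np.int32))))
```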
I didn’t find the results from the TFLite interpreter and the TVM-compiled model to differ given the same TFLite quantized model, if that’s what you mean. I found that the reason TVM’s quantization is sometimes better than TFLite’s is that TVM leaves the first conv layer and the dense layers unquantized by default. However, in some cases, such as tf.keras.applications.MobileNetV2, TVM’s quantization leads to very poor accuracy (top-1 accuracy < 0.01). I don’t know why that happens, because the same code works for DenseNet and ResNet.
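For reference, that skipping behavior is controlled by relay.quantize.qconfig. Quantizing everything, for a closer comparison with TFLite, would look roughly like this (a sketch; the defaults noted in the comments reflect my understanding and may vary across TVM versions, and mod/params are assumed to come from an earlier frontend import):

```python
from tvm import relay

# By default TVM's quantizer skips the first conv layer
# (skip_conv_layers=[0]) and all dense layers (skip_dense_layer=True),
# leaving them in float. To quantize everything:
with relay.quantize.qconfig(
    skip_conv_layers=[],      # quantize every conv layer, including the first
    skip_dense_layer=False,   # quantize dense layers too
    calibrate_mode="global_scale",
    global_scale=8.0,
):
    quantized_mod = relay.quantize.quantize(mod, params)
```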
I see two different kinds of quantized models.
TVM’s CI uses the models tested in tvm/tests/python/frontend/tflite/test_forward.py, hosted at https://github.com/dmlc/web-data/tree/master/tensorflow/models/Quantized
The other kind is: https://github.com/tensorflow/models/tree/master/research/slim/nets/mobilenet
Are you using the first type of model?
Yes. I use the first type of model.
Regarding “TVM’s quantization is better than TFLite’s quantization”: do you mean TVM’s top-1 accuracy is better than TFLite’s?
Yes. For some models, TVM’s quantization leads to better top-1 accuracy.
Hi @bwang1991,
Do you have these results somewhere from when you benchmarked these models?
Thanks & Regards,
Kuladeep.