Support for pre-quantized model int8/uint8 conversion

Hi,

Does QNN support int8 --> uint8 or uint8 --> int8 pre-quantized model conversion? If not, is there a plan to support it?

Tagging @anijain2305 because you are fantastic! Thank you!

Hi @JoeyChou I am not sure what you mean by int8 -> uint8 conversion.

If you want your conv2d and dense inputs and weights to have specific data types, yes, that is certainly possible with the QNN Legalize pass. An example is Intel VNNI instructions, which prefer uint8 for feature maps and int8 for weights. Naturally, pre-quantized models might not follow this rule, so QNN Legalize inserts a requantize node before conv2d and dense to satisfy the datatype restrictions.
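To see why such a dtype flip is cheap, here is a minimal NumPy sketch of the arithmetic a uint8 -> int8 requantize performs when the scale is unchanged: shift the quantized values and the zero point by 128 together, and the real values each representation encodes are identical. The scale and zero-point numbers below are made up for illustration, not taken from any particular model.

```python
import numpy as np

# Hypothetical quantization parameters: a uint8 tensor with
# scale 0.1 and zero point 128 (asymmetric quantization).
scale = 0.1
zp_u8 = 128
q_u8 = np.array([0, 64, 128, 200, 255], dtype=np.uint8)

# A uint8 -> int8 flip with identical input/output scales reduces to
# subtracting 128 from both the values and the zero point.
q_i8 = (q_u8.astype(np.int16) - 128).astype(np.int8)
zp_i8 = zp_u8 - 128

# Both representations dequantize to the same real values.
real_u8 = (q_u8.astype(np.int32) - zp_u8) * scale
real_i8 = (q_i8.astype(np.int32) - zp_i8) * scale
assert np.allclose(real_u8, real_i8)
```

When the scales differ (as in the general VNNI case), the inserted requantize node additionally rescales, but the zero-point shift above is the core of a pure dtype conversion.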

Please look at an example here

Hi @anijain2305, thanks for the reply. I should have made myself clear. What I meant was: if the model's weights and biases were quantized to uint8, does TVM have a way to convert those uint8 weights and biases to int8?

I will certainly try what you suggested, thank you.

Yes, it does. The legalize pass can do this.
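For weights specifically, the conversion the legalize pass performs can be sketched offline in a few lines. This is a hypothetical helper (not a TVM API) showing the value/zero-point shift; the scale is untouched, so the dequantized weights are unchanged.

```python
import numpy as np

def uint8_weight_to_int8(w_u8, zero_point):
    """Convert uint8-quantized weights to int8 without changing
    the real values they encode: shift values and zero point by 128."""
    w_i8 = (w_u8.astype(np.int16) - 128).astype(np.int8)
    return w_i8, zero_point - 128

# Example: a tiny uint8 weight tensor with zero point 128.
w = np.array([[0, 255], [100, 128]], dtype=np.uint8)
w_i8, zp_i8 = uint8_weight_to_int8(w, 128)
# uint8 0/255/100/128 become int8 -128/127/-28/0, with zero point 0.
```

In TVM itself you would not do this by hand; the legalize pass rewrites the QNN ops and their quantization parameters for you, and this sketch only shows the underlying arithmetic.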


Yes, really appreciate your help!