[RFC][Quantization] A new quantization framework in TVM: initial RFC (1/4)

anijain2305 · April 26, 2021, 10:04pm

Thanks, that makes sense. I was thinking that during calibration you could use different attributes for the simulated_quantize and simulated_dequantize ops. In the callback that calibrates an operator, one can simulate the affine space and reason about scales and zero points. But for capturing real values, you could use the passthrough feature of the simulated ops to avoid introducing any quantization error. In this case, qnn.simulated_quantize (passthrough) → nn.conv2d → qnn.simulated_dequantize (passthrough) will work. However, I read your earlier RFC, and I think you are also keeping the original graph so that the real tensor values can be found without any error if needed. So, it makes sense to me.
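To make the passthrough idea concrete, here is a rough plain-Python sketch of what I have in mind. It is not the TVM implementation: the function names, the `passthrough` flag, the symmetric (zero point 0) setup, and the toy 1-D convolution standing in for nn.conv2d are all assumptions made for illustration only.

```python
# Hypothetical stand-ins for the simulated QNN ops; names and signatures
# are illustrative assumptions, not the TVM API.

def simulated_quantize(values, scale, zero_point, passthrough=False,
                       qmin=-128, qmax=127):
    """Map floats into the affine (quantized) domain, but keep them as floats."""
    if passthrough:
        return list(values)  # real values flow through untouched
    out = []
    for v in values:
        q = round(v / scale) + zero_point
        q = max(qmin, min(qmax, q))  # clip to the simulated integer range
        out.append(float(q))
    return out


def simulated_dequantize(values, scale, zero_point, passthrough=False):
    """Map values from the affine domain back to real values."""
    if passthrough:
        return list(values)
    return [(q - zero_point) * scale for q in values]


def conv1d(xs, ws):
    """Toy 'valid' 1-D convolution used in place of nn.conv2d."""
    k = len(ws)
    return [sum(xs[i + j] * ws[j] for j in range(k))
            for i in range(len(xs) - k + 1)]


def run(xs, ws, x_scale, w_scale, passthrough):
    """quantize -> conv -> dequantize, symmetric case (zero point 0)."""
    xq = simulated_quantize(xs, x_scale, 0, passthrough)
    wq = simulated_quantize(ws, w_scale, 0, passthrough)
    y = conv1d(xq, wq)
    # With zero points of 0, the accumulator scale is x_scale * w_scale.
    return simulated_dequantize(y, x_scale * w_scale, 0, passthrough)


x = [0.123, 0.527, -0.331, 0.904]
w = [0.5, -0.25]

real = run(x, w, 0.01, 0.01, passthrough=True)        # error-free real values
simulated = run(x, w, 0.01, 0.01, passthrough=False)  # carries rounding error
errors = [abs(a - b) for a, b in zip(real, simulated)]
```

With passthrough on, the pipeline reduces to the plain float op, so the captured tensors are the real values. With it off, the same graph shows the error that the chosen scales and zero points introduce, which is what a calibration callback would want to look at.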