[RFC][Quantization] A new quantization framework in TVM: initial RFC (1/4)

electriclilies · April 26, 2021, 8:57pm

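Also, as part of the standardization of QNN, we could ensure that all QNN "compute" ops go from int8 -> int8. I believe that qnn.conv2d is the only QNN op that outputs an accumulation dtype, so we could change qnn.conv2d to take in a bias in addition to the data and weight. The bias add and the requantize back to int8 could then happen inside the op, instead of exposing the accumulation dtype to the rest of the graph.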
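To make the int8 -> int8 idea concrete, here is a rough pure-Python sketch of what such a fused op would compute. This is not TVM code and not the current qnn.conv2d signature: the function names (`conv2d_int32`, `fused_qconv2d`), the single-channel stride-1 layout, and the convention that the bias is an int32 value at scale `in_scale * w_scale` are all assumptions made for illustration.

```python
def clamp_int8(x):
    """Saturate an integer to the int8 range."""
    return max(-128, min(127, x))


def conv2d_int32(data, weight, data_zp=0, weight_zp=0):
    """Single-channel, stride-1, unpadded conv over int8 values.

    Returns the raw accumulators (what would be the int32 accumulation dtype).
    """
    kh, kw = len(weight), len(weight[0])
    out_h = len(data) - kh + 1
    out_w = len(data[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0
            for di in range(kh):
                for dj in range(kw):
                    acc += (data[i + di][j + dj] - data_zp) * (weight[di][dj] - weight_zp)
            row.append(acc)
        out.append(row)
    return out


def fused_qconv2d(data, weight, bias, in_scale, w_scale, out_scale,
                  out_zp=0, data_zp=0, weight_zp=0):
    """Hypothetical int8 -> int8 conv: accumulate, add bias, requantize.

    Assumes `bias` is already quantized at scale in_scale * w_scale, so it can
    be added directly to the accumulator before requantizing.
    """
    acc = conv2d_int32(data, weight, data_zp, weight_zp)
    acc_scale = in_scale * w_scale
    return [
        [clamp_int8(round((v + bias) * acc_scale / out_scale) + out_zp) for v in row]
        for row in acc
    ]


data = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
weight = [[1, 2], [3, 4]]
result = fused_qconv2d(data, weight, bias=3,
                       in_scale=0.5, w_scale=0.25, out_scale=0.25)
print(result)
```

With this shape, the accumulation dtype never leaves the op, so callers only ever see int8 inputs and int8 outputs. A real implementation would presumably do the requantize in fixed point rather than floating point.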