Also, as part of the standardization of QNN, we could ensure that all QNN “compute” ops go from int8 -> int8. I believe that qnn.conv2d is the only QNN op that outputs an accumulation dtype, so we could change qnn.conv2d to take in bias in addition to the data and weight.
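
To make the idea concrete, here is a rough NumPy sketch of what an int8 -> int8 convolution with a fused bias might compute. This is not the TVM API: the function name `fused_qconv2d_int8`, its parameters, the NCHW/OIHW layouts, stride 1 with no padding, a symmetric (zero-point 0) weight, and a bias already quantized to int32 with scale `in_scale * w_scale` are all assumptions made for illustration.

```python
import numpy as np


def fused_qconv2d_int8(data, weight, bias, in_scale, w_scale, out_scale,
                       in_zp=0, out_zp=0):
    """Hypothetical int8 -> int8 conv2d with bias (NCHW data, OIHW weight).

    Assumes stride 1, no padding, a weight zero point of 0, and an int32
    bias quantized with scale in_scale * w_scale.
    """
    n, c, h, w = data.shape
    o, _, kh, kw = weight.shape
    oh, ow = h - kh + 1, w - kw + 1

    # Widen to int32 so the accumulation cannot overflow int8.
    x = data.astype(np.int32) - in_zp
    k = weight.astype(np.int32)

    acc = np.zeros((n, o, oh, ow), dtype=np.int32)
    for i in range(kh):
        for j in range(kw):
            patch = x[:, :, i:i + oh, j:j + ow]  # shape (n, c, oh, ow)
            acc += np.einsum("nchw,oc->nohw", patch, k[:, :, i, j])

    # The bias is added while still in the int32 accumulation dtype.
    acc += bias.astype(np.int32).reshape(1, o, 1, 1)

    # Requantize the accumulator to the int8 output scale and saturate.
    scaled = acc.astype(np.float64) * (in_scale * w_scale / out_scale)
    q = np.rint(scaled) + out_zp
    return np.clip(q, -128, 127).astype(np.int8)
```

The point of the sketch is that the int32 accumulator stays internal to the op: the caller passes int8 data and weight plus an int32 bias and gets int8 back, so no accumulation dtype leaks into the graph.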