Hello,
I have a model that I quantized in eager mode in PyTorch. My implementation doesn't use the QuantStub and DeQuantStub layers, because I want the model to accept int8/uint8 inputs directly, without the quantization step. The problem is that the model now expects inputs of dtype qint8 or quint8, which causes issues when importing it into TVM. I have a similar problem with the output: its dtype is qint8 or quint8, which TVM can't work with. Is there a way to deal with this?
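For reference, here is a minimal sketch of the setup I mean (the module and the input scale/zero_point values are hypothetical, and I'm assuming the standard eager-mode prepare/convert flow):

```python
import torch
import torch.nn as nn

# Toy module, quantized in eager mode without QuantStub/DeQuantStub.
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3)
        self.relu = nn.ReLU()

    def forward(self, x):
        # No QuantStub here, so x must already be a quantized tensor.
        return self.relu(self.conv(x))

model = TinyNet().eval()
model.qconfig = torch.quantization.get_default_qconfig("fbgemm")
torch.quantization.prepare(model, inplace=True)
model(torch.randn(1, 3, 32, 32))  # one calibration pass with float data
torch.quantization.convert(model, inplace=True)

# Without QuantStub, the input has to be quantized by hand
# (scale/zero_point chosen arbitrarily for this sketch):
x_q = torch.quantize_per_tensor(
    torch.randn(1, 3, 32, 32), scale=0.02, zero_point=128, dtype=torch.quint8
)
y = model(x_q)
print(y.dtype)            # torch.quint8 -- this is what TVM chokes on
print(torch.int_repr(y))  # the raw uint8 values underneath
```

So both the traced model's input and output carry the quantized quint8 dtype rather than plain int8/uint8, and that's what breaks the TVM import.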