NotImplementedError: The following operators are not implemented: {'FakeQuantWithMinMaxVars'}

Hi @RussellRao, happy to help :).

When you export the model to TFLite (for example with the TFLite converter), the FakeQuantWithMinMaxVars nodes are automatically converted into pairs of Quantize -> Dequantize nodes. The resulting model can be imported directly into TVM, where these become qnn.quantize -> qnn.dequantize nodes.

Then we can run the FakeQuantizationToInteger pass. This pass replaces the floating-point ops with fixed-point variants, using the quantization params from the surrounding qnn.dequantize and qnn.quantize ops.
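To make the idea concrete, here is a rough pure-Python sketch of the arithmetic (not TVM code). The helper names and the example scales and zero points are made up for illustration, and a dot product stands in for a real op. It compares the "fake quantized" float path with an integer-accumulator path that reuses the same quantization params:

```python
def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    # float -> int8: divide by the scale, round, shift by the zero point, clamp
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))

def dequantize(q, scale, zero_point):
    # int8 -> float: remove the zero point, multiply by the scale
    return (q - zero_point) * scale

def fake_quant_dot(xq, wq, sx, zx, sw, zw, so, zo):
    # dequantize -> float dot product -> quantize
    acc = sum(dequantize(a, sx, zx) * dequantize(b, sw, zw) for a, b in zip(xq, wq))
    return quantize(acc, so, zo)

def integer_dot(xq, wq, sx, zx, sw, zw, so, zo):
    # Integer accumulation, then one rescale by sx * sw into the output params.
    # A real fixed-point kernel would do this rescale with an integer
    # multiplier and shift; a float multiply is used here only for brevity.
    acc = sum((a - zx) * (b - zw) for a, b in zip(xq, wq))
    return quantize(acc * (sx * sw), so, zo)

# Hypothetical int8 inputs and quantization params (powers of two keep the floats exact)
xq, wq = [10, -3, 7, 0], [2, 5, -4, 1]
params = dict(sx=0.5, zx=1, sw=0.25, zw=-2, so=0.125, zo=3)

print(fake_quant_dot(xq, wq, **params) == integer_dot(xq, wq, **params))
```

Both paths land on the same int8 value for these inputs, which is why the float op in the middle can be swapped out without changing the result.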

For example, a subgraph of qnn.dequantize -> nn.conv2d -> qnn.quantize is replaced by a single qnn.conv2d op. The original nn.conv2d worked in floating point; the new qnn.conv2d works in fixed point, using the quant params taken from the surrounding qnn.dequantize and qnn.quantize.
