[ONNX] Relay type checker error when calling relay.frontend.from_onnx on a quantized model

Hello,

Trying to put an ONNX model through relay.frontend.from_onnx to get an IRModule results in the following error:

The Relay type checker is unable to show the following types match.
In particular dimension 0 conflicts: 1 does not match 96.
The Relay type checker is unable to show the following types match.
In particular `Tensor[(96), float32]` does not match `Tensor[(1), float32]`

This error was encountered with the quantized ONNX CaffeNet-int8 model that can be found here: https://github.com/onnx/models/tree/master/vision/classification/caffenet

import onnx
from tvm import relay

# quantized CaffeNet (caffenet-12-int8) from the ONNX model zoo
model_path = "/path/to/caffenet-12-int8.onnx"
onnx_model = onnx.load(model_path)

# the model's input is named 'data_0' and takes NCHW images
shape_dict = {'data_0': (1, 3, 224, 224)}

mod, params = relay.frontend.from_onnx(onnx_model, shape_dict)

[screenshot: Relay IR around the failing call, showing the QLinearConv output feeding the pooling operation]

The error seems to be emitted when infer_shape is called from here:

The input to the pooling operation is the output of the QLinearConv, as can be seen in the screenshot above, and the same error can be reproduced if infer_shape is called with the output of _qnn.op.requantize as its argument here:

The reference to Tensor[(96), float32] most likely points to w_scale and the subsequently generated requantize_scale, since their type is TensorType([96], float32).
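For illustration, here is a minimal standalone sketch (the shapes and values are my own, chosen to mirror the CaffeNet case, and are not taken from the frontend code) that triggers the same kind of type-checker failure with qnn.requantize alone:

import numpy as np
import tvm
from tvm import relay

# NCHW tensor with 96 channels on axis 1, like the first conv output
data = relay.var("data", shape=(1, 96, 55, 55), dtype="int32")

# per-channel input scale: one float32 per channel -> Tensor[(96), float32]
in_scale = relay.const(np.ones(96, dtype="float32"))
in_zp = relay.const(0, "int32")
out_scale = relay.const(0.5, "float32")
out_zp = relay.const(0, "int32")

# axis=0 points at the batch axis (extent 1), so the type checker tries to
# unify the 96-element scale vector with dimension 0: "1 does not match 96"
bad = relay.qnn.op.requantize(
    data, in_scale, in_zp, out_scale, out_zp, axis=0, out_dtype="int8"
)

# axis=1 matches the channel axis of the NCHW tensor and type-checks fine
good = relay.qnn.op.requantize(
    data, in_scale, in_zp, out_scale, out_zp, axis=1, out_dtype="int8"
)

mod = tvm.IRModule.from_expr(relay.Function([data], bad))
relay.transform.InferType()(mod)  # raises; building from `good` passes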

cc @AndrewZhaoLuo

You are also welcome to open a GitHub issue.


I’ve opened a GitHub issue: https://github.com/apache/tvm/issues/10046

Yeah it’s probably an issue with handling per-channel quantization correctly. I’ll have time later in the week or early next week to take a look at the problem.

There is also an existing issue with per-channel quantization in another ONNX op: https://github.com/apache/tvm/issues/9908

It’s probably better to test on end-to-end quantized models with per-channel quantization; I’ve never seen such tests for quantized ONNX models (only the unit tests from ONNX, which don’t have good coverage).
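As a rough sketch of what such an end-to-end check could look like (the paths and the non-CaffeNet input name below are placeholders; each zoo model defines its own input name):

import onnx
from tvm import relay

# local paths to int8 models from the ONNX model zoo, with their input shapes
MODELS = {
    "/path/to/caffenet-12-int8.onnx": {"data_0": (1, 3, 224, 224)},
    "/path/to/resnet50-v1-12-int8.onnx": {"data": (1, 3, 224, 224)},
}

for path, shape_dict in MODELS.items():
    onnx_model = onnx.load(path)
    try:
        # from_onnx runs the Relay type checker, so a scale/axis
        # mismatch surfaces here as an exception
        relay.frontend.from_onnx(onnx_model, shape_dict)
        print(path, "PASSED")
    except Exception as err:
        print(path, "FAILED:", err)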

Thank you for taking a look at this.

What should have caught my eye initially was the axis=0 parameter to the requantize operator outlined in the first post, which effectively assumes the first axis to be the channel axis, since axis is "The channel axis for quantization." as per the docstring.

The following may be of some help:

Since ONNX assumes a channel-first (NCHW) layout by default, I’ve tried setting axis=1 (though the channel axis should really be queried somehow) for the requantize mentioned above, as well as for all of the quantize/dequantize/requantize calls in QuantizeLinear, DequantizeLinear and QLinearConv, and tested with that.
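To make that concrete, the change amounts to something like the following in the QLinearConv converter (a sketch; apart from requantize_scale, the variable names around the call are my assumptions, and this is not the actual diff):

out = _qnn.op.requantize(
    out,
    requantize_scale,
    _op.const(0, dtype="int32"),
    y_scale,
    y_zero_point,
    out_dtype=out_dtype,
    axis=1,  # was axis=0; ONNX defaults to NCHW, so channels live on axis 1
)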

With the above change, trying to parse the models listed below, all obtained from the same ONNX model zoo that the aforementioned CaffeNet model was taken from, results in the following:

  • AlexNet - FAILED
    The Relay type checker is unable to show the following types match.
    In particular dimension 0 conflicts: 12288 does not match 256.
    The Relay type checker is unable to show the following types match.
    In particular `Tensor[(256), float32]` does not match `Tensor[(12288), float32]`
    
  • ResNet - PASSED
  • GoogleNet - PASSED
  • SqueezeNet - PASSED
  • ZFNet-512 - PASSED
  • ShuffleNet - PASSED
  • CaffeNet - FAILED
    The Relay type checker is unable to show the following types match.
    In particular dimension 0 conflicts: 12288 does not match 256.
    The Relay type checker is unable to show the following types match.
    In particular `Tensor[(256), float32]` does not match `Tensor[(12288), float32]`
    
  • VGG - PASSED

Without the change, all of the models listed produce an error similar to the one in the first post.

EDIT: Adding direct references to the quantize/dequantize/requantize calls changed in code:


The second failure appears to be something related to grouped convolutions in qnn; I will take a looky loo.
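For what it’s worth, the numbers are suggestive: 12288 = 256 × 48, i.e. out_channels × in_channels-per-group for the groups=2 conv2 layer in AlexNet/CaffeNet (weights of shape (256, 48, 5, 5)), which points at the per-channel scale handling for grouped convolutions in qnn. A minimal sketch of such a layer (the shapes are inferred from the error, and it may not reproduce the exact message on every TVM version):

import numpy as np
import tvm
from tvm import relay

# grouped conv mirroring AlexNet/CaffeNet conv2: 96 -> 256 channels, groups=2
data = relay.var("data", shape=(1, 96, 27, 27), dtype="int8")
weight = relay.const(np.zeros((256, 48, 5, 5), dtype="int8"))

in_scale = relay.const(1.0, "float32")
w_scale = relay.const(np.ones(256, dtype="float32"))  # one scale per out channel
zp = relay.const(0, "int32")

out = relay.qnn.op.conv2d(
    data, weight, zp, zp, in_scale, w_scale,
    kernel_size=(5, 5), channels=256, groups=2, padding=(2, 2),
)
mod = tvm.IRModule.from_expr(relay.Function([data], out))
mod = relay.transform.InferType()(mod)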

Attempted fix here: https://github.com/apache/tvm/pull/10162

Thanks @AndrewZhaoLuo for the quick fix. Would love to see this PR get merged ASAP.
