TVM quantized operators

Hi, I get the following error when trying to export to TVM a quantized model created with Brevitas (https://github.com/Xilinx/brevitas):

NotImplementedError: The following operators are not implemented: ['quantized::linear_prepack', 'quantized::conv2d_prepack']

The error occurs while executing this line:

mod, params = relay.frontend.from_pytorch(my_model, input_shapes)

It seems these operators are not implemented in TVM. Are there plans to improve support for quantized operators? Thanks

You are not supposed to see those ops. We expect them to have already been applied before tracing, rather than being executed at runtime (which is why you are seeing them).

Maybe you are doing QAT? Can you share your complete repro script?

When using Brevitas, you should export to either ONNX or TorchScript first:

https://xilinx.github.io/brevitas/tutorials/tvmcon2021.html#Export

Or, alternatively, add a direct TVM export to Brevitas; for example implementations of other exporters, have a look at:
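For the ONNX route, the model can then be imported with TVM's ONNX frontend instead of the PyTorch one. A minimal sketch, assuming a recent Brevitas release that provides export_onnx_qcdq; the exact function name and keyword arguments vary between Brevitas versions, and the ONNX input name used below is an assumption:

import onnx
import torch
from tvm import relay
from brevitas.export import export_onnx_qcdq  # availability depends on the Brevitas version

# model is the Brevitas QAT model and img_size its input shape, e.g. (1, 1, 28, 28)
export_onnx_qcdq(model, args=torch.randn(img_size), export_path="model.onnx")
onnx_model = onnx.load("model.onnx")
mod, params = relay.frontend.from_onnx(onnx_model, {"input": img_size})  # input name is an assumption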

Hi @masahi, thanks for your answer. I'm doing QAT with the Brevitas library.

Thank you for your answer as well, @cgerum. I am already exporting to TorchScript first. Actually, the TVMCon 2021 tutorial does not go far enough to reproduce the issue, because the issue appears only after the Brevitas export, at: relay.frontend.from_pytorch(traced_model, [("input", img_size)])

The code in the README of the Brevitas GitHub page (https://github.com/Xilinx/brevitas) raises the same error. I have already mentioned it twice on the Brevitas Gitter but have received no answer.

Here is a minimal script to reproduce the issue:

import torch
from tvm import relay
import brevitas.nn as qnn
from brevitas.quant.scaled_int import Int8ActPerTensorFloat, Uint8ActPerTensorFloat, Int8WeightPerTensorFloat
from brevitas.export import export_pytorch_quant


class BasicQNN(torch.nn.Module):
    def __init__(self):
        super(BasicQNN, self).__init__()

        self.quant_img = qnn.QuantIdentity(act_quant=Uint8ActPerTensorFloat, return_quant_tensor=True)

        self.conv1 = torch.nn.Sequential(
            qnn.QuantConv2d(in_channels=1, out_channels=16, kernel_size=3, stride=2, padding=1, bias=False,
                            weight_quant=Int8WeightPerTensorFloat, output_quant=Int8ActPerTensorFloat,
                            return_quant_tensor=True),
            torch.nn.ReLU()
        )

        self.conv2 = torch.nn.Sequential(
            qnn.QuantConv2d(in_channels=16, out_channels=32, kernel_size=3, stride=2, padding=1, bias=False,
                            weight_quant=Int8WeightPerTensorFloat, output_quant=Int8ActPerTensorFloat,
                            return_quant_tensor=True),
            torch.nn.ReLU()
        )

        self.fc = torch.nn.Sequential(
            qnn.QuantLinear(in_features=32*7*7, out_features=10, bias=False,
                            weight_quant=Int8WeightPerTensorFloat, output_quant=Int8ActPerTensorFloat,
                            return_quant_tensor=False)
        )

    def forward(self, x):
        x = self.quant_img(x)
        x = self.conv1(x)
        x = self.conv2(x)
        x = x.reshape(x.shape[0], -1)
        x = self.fc(x)
        return x


def main():

    # Create pytorch model
    model = BasicQNN().cpu().eval()
    img_size = (1, 1, 28, 28)

    # Export to torchscript
    traced_model = export_pytorch_quant(model, input_shape=img_size)

    # Import into TVM Relay; this is where the prepack error is raised.
    # from_pytorch expects a list of (name, shape) tuples for its inputs.
    mod, params = relay.frontend.from_pytorch(traced_model, [("input", img_size)])


if __name__ == "__main__":
    main()

Any update? Are you able to reproduce the issue, @masahi, @cgerum?

Yes, I reproduced it. It looks like export_pytorch_quant only traces the QAT model. To actually prepare a quantized model that we can import, they need to follow the steps in https://github.com/apache/tvm/blob/main/gallery/how_to/deploy_models/deploy_prequantized.py#L125-L130.
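For reference, those lines boil down to PyTorch's eager-mode post-training quantization flow, where convert() executes the prepack ops ahead of tracing. A minimal sketch, assuming a quantizable torchvision-style model that provides a fuse_model method:

import torch
from tvm import relay

def quantize_model(model, inp):
    model.fuse_model()  # fuse conv/bn/relu patterns where the model supports it
    model.qconfig = torch.quantization.get_default_qconfig("fbgemm")
    torch.quantization.prepare(model, inplace=True)
    model(inp)  # calibration pass with representative data
    # convert() runs the prepack ops here, so they never appear in the trace
    torch.quantization.convert(model, inplace=True)

inp = torch.rand(1, 3, 224, 224)
quantize_model(model, inp)  # model: a quantizable torchvision-style model in eval mode
script_module = torch.jit.trace(model, inp).eval()
mod, params = relay.frontend.from_pytorch(script_module, [("input", (1, 3, 224, 224))])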

As I said before, quantized::linear_prepack and quantized::conv2d_prepack are not supposed to be in the input TorchScript, since they convert fp32 weights into PyTorch's internal quantized tensor format. We have no way to convert them in TVM.
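One way to check whether a given TorchScript module will hit this, before handing it to TVM, is to scan its graph for prepack calls. A small diagnostic sketch; the helper below is hypothetical, not part of the TVM or Brevitas APIs:

def has_prepack_ops(script_module):
    # Prepack calls appear verbatim in the printed TorchScript graph
    graph_str = str(script_module.inlined_graph)
    return any(op in graph_str for op in
               ("quantized::linear_prepack", "quantized::conv2d_prepack"))

print(has_prepack_ops(traced_model))  # True for the repro script above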

So please raise an issue on the Brevitas repo. This is not something we can fix on our side.

Thanks a lot @masahi!