Error while compiling quantized PyTorch model

Hello, I was trying to compile a quantized PyTorch MobileNet_V2 model and got the following error:

Traceback (most recent call last):
  File "compile_jit_quantized_model.py", line 46, in <module>
    torch_relay, relay_parameters = tvm.relay.frontend.from_pytorch(quantized_traced_model, [("input", [1, 3, 224, 224])])
  File "/Users/alopez/Documents/Code/tvm/python/tvm/relay/frontend/pytorch.py", line 3324, in from_pytorch
    qnn_torch.add_input_quant_params_to_op_inputs(graph)
  File "/Users/alopez/Documents/Code/tvm/python/tvm/relay/frontend/qnn_torch.py", line 334, in add_input_quant_params_to_op_inputs
    scale, zp = _get_quant_param_for_input(node.inputsAt(i))
  File "/Users/alopez/Documents/Code/tvm/python/tvm/relay/frontend/qnn_torch.py", line 164, in _get_quant_param_for_input
    return dfs(input_value.node())
  File "/Users/alopez/Documents/Code/tvm/python/tvm/relay/frontend/qnn_torch.py", line 159, in dfs
    return dfs(arg.node())
  File "/Users/alopez/Documents/Code/tvm/python/tvm/relay/frontend/qnn_torch.py", line 152, in dfs
    scale = current_node.inputsAt(indices[0])
RuntimeError: ArrayRef: invalid index Index = 6; Length = 6

I know that TVM officially supports only PyTorch 1.4, but I have worked with non-quantized scripts in version 1.6, so I'm not sure whether this is a versioning issue. For ease of reproduction I am attaching the source script; you will need to change the calibration image paths.

import torch, torchvision
from PIL import Image
import numpy as np
import tvm, tvm.relay

IMAGES_FILENAMES = [
    "../support/resources/images/imagenet_images/raw/IMG_9448.JPG",
    "../support/resources/images/imagenet_images/raw/elephant-299.jpg",
    "../support/resources/images/imagenet_images/raw/person.jpg",
    "../support/resources/images/imagenet_images/raw/table.jpg",
]


def get_calibration_image(filename):
    image = Image.open(filename)

    preprocess = torchvision.transforms.Compose([
        torchvision.transforms.Resize([224, 224]),
        torchvision.transforms.ToTensor(),
        torchvision.transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
    ])

    return np.expand_dims(preprocess(image), 0)


def calibrate_model(model, calibration_image_filenames):  # calibration callback
    with torch.no_grad():
        for image_filename in calibration_image_filenames:
            print(f"Calibrating with {image_filename} ...")
            model(torch.tensor(get_calibration_image(image_filename)))


model = torchvision.models.mobilenet_v2(pretrained=True).eval()

traced_model = torch.jit.trace(model, torch.randn(1, 3, 224, 224))

qconfig = torch.quantization.qconfig.get_default_qconfig("fbgemm")

quantized_traced_model = torch.quantization.quantize_jit(
    traced_model,
    {"": qconfig},
    calibrate_model,
    [IMAGES_FILENAMES]
)

torch_relay, relay_parameters = tvm.relay.frontend.from_pytorch(quantized_traced_model, [("input", [1, 3, 224, 224])])

with tvm.transform.PassContext(opt_level=3):
    compiled_torch = tvm.relay.build(torch_relay,
                                     target="llvm",
                                     target_host="llvm",
                                     params=relay_parameters)

compiled_torch.export_library("compiled_mobilenetv2.so")

Any pointers will be greatly appreciated.

Thanks!

We don’t support graph-mode quantization yet. Last time I checked, it was fairly unstable.

I’ve only tried importing eager-mode quantized models, for which we have good support. But I must admit the eager-mode quantization workflow is not great (it requires rewriting the model, etc.), so having support for graph-mode quantization is certainly desirable.
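
To give an idea of what "rewriting the model" means, here is a minimal sketch of the eager-mode static quantization recipe on a toy model (the ToyModel below is purely illustrative; a real MobileNetV2 additionally needs its functional residual adds replaced with torch.nn.quantized.FloatFunctional):

import torch

class ToyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # The model definition itself must be edited to mark the
        # fp32 <-> int8 boundaries with quant/dequant stubs.
        self.quant = torch.quantization.QuantStub()
        self.conv = torch.nn.Conv2d(3, 8, 3)
        self.relu = torch.nn.ReLU()
        self.dequant = torch.quantization.DeQuantStub()

    def forward(self, x):
        x = self.quant(x)
        x = self.relu(self.conv(x))
        return self.dequant(x)

model = ToyModel().eval()
model.qconfig = torch.quantization.get_default_qconfig("fbgemm")
torch.quantization.fuse_modules(model, [["conv", "relu"]], inplace=True)  # fuse conv+relu
torch.quantization.prepare(model, inplace=True)  # insert observers
with torch.no_grad():  # calibrate with representative inputs
    model(torch.randn(1, 3, 224, 224))
torch.quantization.convert(model, inplace=True)  # swap in quantized modules
traced = torch.jit.trace(model, torch.randn(1, 3, 224, 224))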

Thank you! I’ll take a look at the eager-mode quantization workflow and see if it’s something I can use.

If you want to use a quantized MobileNet V2, you can have a look at our test here: https://github.com/apache/incubator-tvm/blob/main/tests/python/frontend/pytorch/qnn_test.py#L371
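
For reference, a minimal sketch of that path (assuming the pre-quantized MobileNetV2 from torchvision's quantization model zoo, which requires torchvision >= 0.5; the linked test follows the same pattern):

import torch
import torchvision
import tvm, tvm.relay

# Load an eager-mode quantized MobileNetV2 with pre-quantized weights,
# trace it, and import the trace through the PyTorch frontend.
qmodel = torchvision.models.quantization.mobilenet_v2(pretrained=True, quantize=True).eval()
traced = torch.jit.trace(qmodel, torch.randn(1, 3, 224, 224))
mod, params = tvm.relay.frontend.from_pytorch(traced, [("input", [1, 3, 224, 224])])

with tvm.transform.PassContext(opt_level=3):
    lib = tvm.relay.build(mod, target="llvm", params=params)

lib.export_library("quantized_mobilenetv2.so")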
