To work around the memory explosion issue, you can try the `calibrate_chunk_by` option:
```python
def test_calibrate_memory_bound():
    mod, params = testing.synthetic.get_workload()
    dataset = get_calibration_dataset(mod, "data")
    import multiprocessing

    num_cpu = multiprocessing.cpu_count()
    with relay.quantize.qconfig(calibrate_mode="kl_divergence", calibrate_chunk_by=num_cpu):
        relay.quantize.quantize(mod, params, dataset)


def test_calibrate_percentile():
    mod, params = testing.synthetic.get_workload()
    dataset = get_calibration_dataset(mod, "data")
    with relay.quantize.qconfig(calibrate_mode="percentile"):
        relay.quantize.quantize(mod, params, dataset)
```
For the error, you can try running `relay.transform.FoldConstant()` before quantizing. The weights need to be constants to be quantized, but it seems they are not in your model.
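As a rough sketch, the fold-then-quantize flow could look like the helper below. This is illustrative, not from the original post: the function name `fold_then_quantize` is made up, and binding params into the module first (via `relay.build_module.bind_params_by_name`) is an assumption about why the weights are not constants in your model.

```python
def fold_then_quantize(mod, params, dataset):
    """Hypothetical helper: bind params so weights become Relay constants,
    fold them, then run quantization on the result."""
    from tvm import relay

    # Bind the weight params into the main function so FoldConstant
    # can evaluate the weight expressions down to Constant nodes.
    mod["main"] = relay.build_module.bind_params_by_name(mod["main"], params)
    mod = relay.transform.FoldConstant()(mod)

    with relay.quantize.qconfig(calibrate_mode="kl_divergence"):
        return relay.quantize.quantize(mod, params, dataset)
```

If the weights are produced by some preprocessing expression (e.g. a transpose or cast of a parameter), `FoldConstant` collapses that expression into a single constant, which is what the quantizer expects.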
Note that the existing quantization functionality in TVM is very limited and is not actively developed or maintained. There is a new proposal to rework our quantization support in [RFC][Quantization] A new quantization framework in TVM: initial RFC (1/4).