Why is tvm.relay.quantize not documented? // Status of quantization in TVM

tomhepworth · August 6, 2024, 10:54am

I am struggling to work out the current status of quantization in TVM. Some posts I have seen suggest that TVM can perform quantization, but it is not obvious how to do it.

The API contains tvm.relay.quantize (https://github.com/apache/tvm/blob/main/python/tvm/relay/quantize/quantize.py), but as far as I can tell it is not documented. Does it work? What are its limitations?

Can anyone provide an overview of the status of quantization in TVM?