I am struggling to work out the status of quantization in TVM. Some posts I have seen suggest that TVM can perform quantization, but it is not obvious how to do it.
The API includes tvm.relay.quantize (https://github.com/apache/tvm/blob/main/python/tvm/relay/quantize/quantize.py), but as far as I can tell it is not documented. Does it work? What are its limitations?
Can anyone provide an overview of the status of quantization in TVM?