Hi,
I have been trying to run the quantization example described in the following post:
https://tvm.ai/2019/04/29/opt-cuda-quantized.html
However, the machine that I am using for now has a GPU with CUDA Compute Capability 3.0, which is too low to support the dp4a instruction.
Are there any other automatic quantization examples that do not use the dp4a instruction on a GPU, or perhaps examples targeting x86? My goal for now is to get started with TVM and its quantization pass.
I would appreciate it if someone could point me to some quantization code examples like these.
Thanks!