Hi friends, I have a question: does the current Relay build with BYOC TensorRT support INT8 models? Are there any code examples? Thank you very much! Best
As far as I know, we currently don't support it. @trevor-m @comaniac may be able to provide more information.
I saw someone did this: int8 calibration · trevor-m/tvm@d159105 · GitHub. Not sure whether we can merge it.
AFAIK, TVM-TRT currently only supports fp32 and fp16. Not sure about int8 support, though. cc @trevor-m
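For reference, the current fp32 path looks roughly like this (a minimal sketch based on the TVM TensorRT integration docs; the exact signature of `partition_for_tensorrt` has changed across TVM versions, and fp16 is enabled at runtime via the `TVM_TENSORRT_USE_FP16=1` environment variable):

```python
import tvm
from tvm import relay
from tvm.relay.op.contrib.tensorrt import partition_for_tensorrt

# `mod` and `params` are assumed to come from an earlier frontend import,
# e.g. relay.frontend.from_onnx(...).
mod, config = partition_for_tensorrt(mod, params)

# Build with the TensorRT-annotated subgraphs offloaded to TRT.
with tvm.transform.PassContext(
    opt_level=3, config={"relay.ext.tensorrt.options": config}
):
    lib = relay.build(mod, target="cuda", params=params)
```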
We currently don't support int8. I was working on the branch you linked, but it is a bit outdated now. I might be able to update it and create a PR soon.
Great! Thank you! I think it will be really useful if we can support int8, since many quantized models are in int8 format.
Hi, may I ask whether you have made any progress on this topic? If not, I may be able to help solve this problem if I have time this week.
Based on the replies, I think there are two different things being discussed: 1. Supporting TRT int8 calibration. 2. Making TRT accept int8-quantized models (e.g., accepting the ONNX QLinearConv operator). These are not the same thing. @trevor-m, the PR you mentioned covers the first one, if I understand correctly?
Yes, what I mean is the first one.
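For anyone following along, here is a minimal sketch of what option 1 (TRT int8 calibration) involves at the TensorRT level. This is not the linked branch and not TVM code, just the standard TensorRT Python calibrator interface, with pycuda assumed for device buffers and a single input binding for simplicity:

```python
import numpy as np
import pycuda.autoinit  # noqa: F401  (creates a CUDA context)
import pycuda.driver as cuda
import tensorrt as trt


class EntropyCalibrator(trt.IInt8EntropyCalibrator2):
    """Feeds representative fp32 batches so TensorRT can choose int8 scales."""

    def __init__(self, batches, cache_file="calib.cache"):
        trt.IInt8EntropyCalibrator2.__init__(self)
        self.batches = batches          # list of np.float32 arrays, one per batch
        self.index = 0
        self.cache_file = cache_file
        self.device_input = cuda.mem_alloc(batches[0].nbytes)

    def get_batch_size(self):
        return self.batches[0].shape[0]

    def get_batch(self, names):
        if self.index >= len(self.batches):
            return None                 # no more data: calibration is finished
        cuda.memcpy_htod(self.device_input,
                         np.ascontiguousarray(self.batches[self.index]))
        self.index += 1
        return [int(self.device_input)]  # one device pointer per input binding

    def read_calibration_cache(self):
        # Reusing a cache skips recalibration on later engine builds.
        try:
            with open(self.cache_file, "rb") as f:
                return f.read()
        except FileNotFoundError:
            return None

    def write_calibration_cache(self, cache):
        with open(self.cache_file, "wb") as f:
            f.write(cache)


# During engine building, int8 is enabled roughly like:
#   config.set_flag(trt.BuilderFlag.INT8)
#   config.int8_calibrator = EntropyCalibrator(calibration_batches)
```

Option 2, by contrast, would mean having the frontend ingest an already-quantized graph (QLinearConv etc.) and passing the int8 operators through to TRT, with no calibration step.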