Hi friends, I have a question: does the current Relay build with BYOC TensorRT support INT8 models? Are there any code examples? Thank you very much! Best
As far as I know, we currently don't support it. @trevor-m @comaniac may be able to provide more information.
I saw someone did this: int8 calibration · trevor-m/tvm@d159105 · GitHub. Not sure whether we can merge it.
AFAIK, TVM-TRT currently only supports fp32 and fp16. Not sure about int8 support, though. cc @trevor-m
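For reference, the current fp32 path looks roughly like this (a minimal sketch based on the TVM TensorRT integration docs; the exact signature of `partition_for_tensorrt` has changed across TVM versions, and fp16 is enabled at runtime via the `TVM_TENSORRT_USE_FP16=1` environment variable):

```python
import tvm
from tvm import relay
from tvm.relay.op.contrib.tensorrt import partition_for_tensorrt

# `mod` and `params` are assumed to come from an earlier frontend import,
# e.g. relay.frontend.from_onnx(...).
mod, config = partition_for_tensorrt(mod, params)

# Build with the TensorRT-annotated subgraphs offloaded to TRT.
with tvm.transform.PassContext(
    opt_level=3, config={"relay.ext.tensorrt.options": config}
):
    lib = relay.build(mod, target="cuda", params=params)
```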
We currently don't support int8. I was working on the branch you linked, but it is a bit outdated now. I might be able to update it and create a PR soon.
Great! Thank you! I think it will be really useful if we can support int8, since many quantized models are in int8 format.
Hi, may I ask whether you have made any progress on this topic? If not, I may be able to help solve this problem if I have time this week.
Based on the replies, I think there are two different things being discussed: 1. Supporting TRT int8 calibration. 2. Making TRT accept int8-quantized models (e.g., accepting the ONNX QLinearConv operator). These are not the same thing. @trevor-m, the PR you mentioned covers the first one, if I understand correctly?
Yes, what I mean is the first one.
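For anyone following along, here is a minimal sketch of what option 1 (TRT int8 calibration) involves at the TensorRT level. This is not the linked branch and not TVM code, just the standard TensorRT Python calibrator interface, with pycuda assumed for device buffers and a single input binding for simplicity:

```python
import numpy as np
import pycuda.autoinit  # noqa: F401  (creates a CUDA context)
import pycuda.driver as cuda
import tensorrt as trt


class EntropyCalibrator(trt.IInt8EntropyCalibrator2):
    """Feeds representative fp32 batches so TensorRT can choose int8 scales."""

    def __init__(self, batches, cache_file="calib.cache"):
        trt.IInt8EntropyCalibrator2.__init__(self)
        self.batches = batches          # list of np.float32 arrays, one per batch
        self.index = 0
        self.cache_file = cache_file
        self.device_input = cuda.mem_alloc(batches[0].nbytes)

    def get_batch_size(self):
        return self.batches[0].shape[0]

    def get_batch(self, names):
        if self.index >= len(self.batches):
            return None                 # no more data: calibration is finished
        cuda.memcpy_htod(self.device_input,
                         np.ascontiguousarray(self.batches[self.index]))
        self.index += 1
        return [int(self.device_input)]  # one device pointer per input binding

    def read_calibration_cache(self):
        # Reusing a cache skips recalibration on later engine builds.
        try:
            with open(self.cache_file, "rb") as f:
                return f.read()
        except FileNotFoundError:
            return None

    def write_calibration_cache(self, cache):
        with open(self.cache_file, "wb") as f:
            f.write(cache)


# During engine building, int8 is enabled roughly like:
#   config.set_flag(trt.BuilderFlag.INT8)
#   config.int8_calibrator = EntropyCalibrator(calibration_batches)
```

Option 2, by contrast, would mean having the frontend ingest an already-quantized graph (QLinearConv etc.) and passing the int8 operators through to TRT, with no calibration step.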