Int8 quantization on ViT, especially LayerNorm

Hi everyone! I’m new to TVM and quantization.

Currently I'm struggling with a problem concerning quantization of vision transformers, specifically the LayerNorm module. When I ran experiments with FakeQuant (an academic simulation), I found that the input feature of LayerNorm (which is also the output feature of the skip-connection) is very sensitive to 8-bit quantization (using post-training quantization with a small calibration set). Accuracy degradation shows up on all sorts of ViTs (and DeiTs).
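For illustration, here is a minimal sketch of the kind of fake-quant experiment I mean (PyTorch assumed; the symmetric min/max calibration below is a simplification, not my exact setup):

```python
import torch

def fake_quant_int8(x: torch.Tensor) -> torch.Tensor:
    # Symmetric per-tensor int8 fake quantization; min/max calibration is
    # the simplest possible choice here, not my actual calibration method.
    scale = x.abs().max() / 127.0
    return torch.fake_quantize_per_tensor_affine(x, scale.item(), 0, -128, 127)

# Toy stand-in for the skip-connection output that feeds LayerNorm.
# 384 is the embed dim of DeiT-Small, 197 the token count, for illustration.
ln = torch.nn.LayerNorm(384)
residual = torch.randn(1, 197, 384)
residual[0, :, :4] += 50.0  # a few outlier channels make the effect worse

ref = ln(residual)
quant = ln(fake_quant_int8(residual))
print("mean abs error after LayerNorm:", (ref - quant).abs().mean().item())
```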

I noticed that somebody has already tested TVM with ViTs, but neither accuracy nor the LayerNorm module is mentioned, which confuses me even more.

I'm wondering how TVM supports Int8 inference on ViT. Specifically, is there any special configuration for quantizing LayerNorm or the skip-connection?
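For context, this is the only relay quantization flow I am aware of (a sketch assuming `relay.quantize`; I don't see a LayerNorm- or residual-specific knob in `qconfig`, which is part of what I'm asking):

```python
import tvm
from tvm import relay

def quantize_int8(mod, params):
    # Sketch only: these qconfig fields are the generic ones; I am unsure
    # what, if anything, controls LayerNorm / skip-connection handling.
    with relay.quantize.qconfig(
        nbit_input=8,
        nbit_weight=8,
        calibrate_mode="global_scale",
        global_scale=8.0,
        skip_conv_layers=[0],  # commonly skipped for CNNs; the ViT analog is unclear
    ):
        return relay.quantize.quantize(mod, params=params)
```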

Thanks!

Hi, I am running into quantization errors when importing ViT using relay, and I currently can't get the model to run at all. Could you share your code for importing ViT, e.g. on GitHub? Thanks a lot!
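In case it helps, this is roughly the import flow I am attempting (a sketch; timm, the model name, and the PyTorch frontend are my own choices, not code from this thread):

```python
import torch
import timm
import tvm
from tvm import relay

# Trace a pretrained ViT and hand it to relay's PyTorch frontend.
model = timm.create_model("vit_base_patch16_224", pretrained=True).eval()
input_shape = (1, 3, 224, 224)
scripted = torch.jit.trace(model, torch.randn(input_shape))

mod, params = relay.frontend.from_pytorch(scripted, [("input0", input_shape)])
print(mod["main"])
```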