INT8 Quantization - Code generation for backends

thierry · July 26, 2018, 3:36am

Similar observation to the one in your other post: it might be good to think about supporting arbitrary-precision operators at the graph level, since there is ongoing work on bit-serial convolutions on the RPi, as well as on quantized operators for FPGAs, accelerators, etc.

The challenge with arbitrary precision is data layout, specifically how to pack sub-byte data into standard 8-bit/32-bit words.
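To make the packing question concrete, here is a minimal sketch in plain Python. It is not how TVM or VTA lays out data; the function names `pack_words` and `unpack_words` are hypothetical, and it assumes unsigned values, a bit width that evenly divides the word width, and the first element stored in the least significant bits of each word.

```python
def pack_words(values, bits, word_bits=32):
    """Pack unsigned `bits`-wide integers into `word_bits`-wide words.

    Hypothetical helper for illustration only. Element 0 of each group
    goes into the least significant bits of the word (an assumed layout).
    """
    if bits <= 0 or word_bits % bits != 0:
        raise ValueError("this sketch requires bits to divide word_bits")
    per_word = word_bits // bits
    mask = (1 << bits) - 1
    words = []
    for start in range(0, len(values), per_word):
        word = 0
        for lane, value in enumerate(values[start:start + per_word]):
            if not 0 <= value <= mask:
                raise ValueError(f"{value} does not fit in {bits} bits")
            word |= value << (lane * bits)
        words.append(word)  # a short final group is zero-padded
    return words


def unpack_words(words, bits, count, word_bits=32):
    """Inverse of pack_words: recover the first `count` elements."""
    per_word = word_bits // bits
    mask = (1 << bits) - 1
    out = []
    for word in words:
        for lane in range(per_word):
            out.append((word >> (lane * bits)) & mask)
    return out[:count]
```

With 2-bit values and 8-bit words, for example, four elements share one byte, and the element count has to be carried separately because the last word may be padded. Bit widths that do not divide the word size (3-bit, 5-bit) would need elements to straddle word boundaries or waste bits, which is part of what makes the layout choice hard.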

I’m happy to discuss some ideas, and present some concrete scenarios down the road as we continue our work on quantization with VTA.