[RFC][Tensorcore] INT4 end-to-end inference

So you want to run on the CPU instead of the GPU? What error do you see after changing the target? I think the inference script in my repo won’t work as-is, because some of the convolution layouts it uses are not yet supported by the x86 CPU compute. If you want to run the quantized NNs on CPU, you may need to change the data layout to one that is currently supported there.
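To make "tweak the data layout" a bit more concrete, here is a minimal sketch in plain Python of what a layout change means in terms of tensor axes. The helper names (`layout_permutation`, `convert_shape`) and the NHWC/NCHW pair are only illustrative assumptions, not something from my repo or from TVM. In TVM itself you would normally let the Relay `ConvertLayout` pass do this for you rather than permuting axes by hand.

```python
# Hypothetical helpers for illustration only; not part of TVM or the repo.

def layout_permutation(src: str, dst: str) -> tuple:
    """Return the axis order that turns a `src`-layout tensor into `dst` layout.

    Each layout is a string with one letter per axis, e.g. "NHWC" or "NCHW".
    """
    if len(set(src)) != len(src) or sorted(src) != sorted(dst):
        raise ValueError(f"incompatible layouts: {src!r} -> {dst!r}")
    # For every axis of the destination, look up where it sits in the source.
    return tuple(src.index(axis) for axis in dst)


def convert_shape(shape: tuple, src: str, dst: str) -> tuple:
    """Reorder a shape tuple from `src` layout to `dst` layout."""
    if len(shape) != len(src):
        raise ValueError("shape rank does not match layout")
    return tuple(shape[i] for i in layout_permutation(src, dst))


# Example (assumed layouts): an NHWC activation reordered to NCHW.
perm = layout_permutation("NHWC", "NCHW")
new_shape = convert_shape((1, 224, 224, 3), "NHWC", "NCHW")
print(perm, new_shape)
```

The tuple returned by `layout_permutation` is what you would pass to a transpose call on the actual tensor. Which destination layout to pick depends on which conv2d layouts the x86 schedules accept in your TVM version, so check that before converting.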