[BYOC][Quantization] Propagate channels-last PyTorch model to TVM without layer_transforms

This is something I’ve also been looking at.

It seems that PyTorch’s ONNX export doesn’t respect model.to(memory_format=torch.channels_last), so the exported graph is NCHW as well.

I was looking at the layout transformation passes on the TVM side. I think these would run offline, after the whole model has been imported, as part of the optimization passes applied to the converted module.

However, as far as I can see, you need to be explicit about which ops you want to transform the layout of. As this post discusses, and as the convert layout docs show, the desired layouts have to be listed per op: nn.conv2d, nn.dense, etc.

As far as I can see there isn’t a single convert_layout(mod, "NHWC") function that does a “best effort” conversion of all ops in a model.
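For reference, a minimal sketch of what the explicit per-op mapping looks like with Relay’s ConvertLayout pass. The nn.conv2d entry follows the documented form; the extra pooling entries are assumptions on my part about which other ops you might want to list, and convert_to_nhwc is a hypothetical helper name:

```python
# Explicit per-op layout mapping for TVM's ConvertLayout pass.
# "default" lets TVM pick the kernel layout; every op whose layout
# we want transformed must be listed here -- there is no catch-all.
desired_layouts = {
    "nn.conv2d": ["NHWC", "default"],
    "nn.max_pool2d": ["NHWC"],       # assumed entry, not from the docs example
    "nn.global_avg_pool2d": ["NHWC"],  # assumed entry, not from the docs example
}

def convert_to_nhwc(mod):
    """Hypothetical helper: apply ConvertLayout to a Relay module (needs tvm)."""
    from tvm import relay, transform  # lazy import so the dict above is usable without tvm

    seq = transform.Sequential([
        relay.transform.RemoveUnusedFunctions(),
        relay.transform.ConvertLayout(desired_layouts),
    ])
    with transform.PassContext(opt_level=3):
        return seq(mod)

# The point being: any op not named in desired_layouts keeps its layout.
print("nn.conv2d" in desired_layouts)
```

So a “convert everything” call would amount to maintaining this dict by hand for every layout-sensitive op in the model.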