Transform layout without layout_transform

The ConvertLayout pass works well but unfortunately leaves layout_transform ops in the graph, which can be a real pain to remove. I used pattern rewrite but I’d like a more generic solution.

More specifically, I’d like to convert a full model (including inputs and outputs) from one layout to another, say NCHW to NHWC. The resulting graph must be clean, without any layout_transform: just the original ops with tensors in the final layout, and weight tensors in the corresponding layout, e.g. for convolutions.
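As a rough illustration of what "tensors in the final layout" means for activations and weights, here is a minimal sketch in plain Python (no TVM involved). The helper names `layout_perm` and `permute_shape` are made up for this example, and pairing OIHW/HWIO kernels with NCHW/NHWC data is an assumption about the usual convention, not something the pass guarantees:

```python
def layout_perm(src, dst):
    """Return transpose axes that turn a `src`-layout tensor into `dst` layout.

    Only handles plain layouts where each axis letter appears once
    (no blocked layouts such as NCHW4c).
    """
    if sorted(src) != sorted(dst) or len(set(src)) != len(src):
        raise ValueError(f"cannot convert {src} to {dst}")
    return tuple(src.index(axis) for axis in dst)


def permute_shape(shape, axes):
    """Apply a transpose permutation to a shape tuple."""
    return tuple(shape[a] for a in axes)


# Activations: NCHW -> NHWC
data_axes = layout_perm("NCHW", "NHWC")
# Convolution weights: OIHW -> HWIO (assumed pairing, see lead-in)
kernel_axes = layout_perm("OIHW", "HWIO")

print(data_axes, permute_shape((1, 3, 224, 224), data_axes))
print(kernel_axes, permute_shape((64, 3, 7, 7), kernel_axes))
```

A fully converted graph would carry every activation in the first form and every convolution kernel in the second, with no transform op left in between.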

Is there a generic way to do that?

I don’t think there is a generic way to do that, because it actually changes the model specification. The ConvertLayout pass focuses on the layout inside the graph while preserving the original spec (i.e., the input and output format/layout). If you are willing to change the model spec, the best solution is exactly what you have proposed, IMO.

cc @anijain2305

@comaniac is right. ConvertLayout preserves the original spec. There will be at least two layout_transforms, one at the beginning and the other at the end, when converting from NCHW to NHWC. In situations where there are layout transforms in between, I would suggest taking a look at this -

The doc has some explanation on how to add convert layout support for new operators. It might be possible that your graph has some operators that are not yet supported by convert layout.

Try with EfficientNet. All the ops are supported. In the end, 2 layout_transform ops remain. One would expect them to be just after the input and just before the output, but that is not the case.

For example, at the input, the graph remains x -> nn.pad -> layout_transform, even though pad is supported. So one has to write a pass to change the pad layout. At the end, the layout_transform sits a few ops before the output, but there it can simply be removed with another pass.
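To make the pad case concrete: moving the layout_transform in front of nn.pad means the per-axis padding has to be reordered to match the new layout. A minimal sketch in plain Python, assuming the padding is written as one (before, after) pair per axis; the helper name `convert_pad_width` is invented for this example:

```python
def convert_pad_width(pad_width, src_layout, dst_layout):
    """Reorder per-axis (before, after) padding from src_layout to dst_layout."""
    if len(pad_width) != len(src_layout):
        raise ValueError("pad_width must have one entry per layout axis")
    if sorted(src_layout) != sorted(dst_layout):
        raise ValueError(f"cannot convert {src_layout} to {dst_layout}")
    # Map each axis letter to its padding, then read them back in the new order.
    by_axis = dict(zip(src_layout, pad_width))
    return tuple(by_axis[axis] for axis in dst_layout)


# Padding only on H and W, expressed for an NCHW tensor.
nchw_pad = ((0, 0), (0, 0), (1, 2), (3, 4))
nhwc_pad = convert_pad_width(nchw_pad, "NCHW", "NHWC")
print(nhwc_pad)
```

With the padding rewritten this way, the graph could in principle become x -> layout_transform -> nn.pad, which puts the transform right at the input.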

Even if we respect the model spec, I wonder if it might be a bug that layout_transforms aren’t just at input and output?

I don’t want to hijack the thread, but I also have a question about the ConvertLayout transform.

I had to convert a Keras (NHWC) model to ONNX (because some Keras operators are not implemented in the Relay frontend). This initial translation apparently also converts my network to NCHW. Because my own operator implementations are described in NHWC (and this is a hard requirement), I decided to apply ConvertLayout to the ONNX model to get it back to NHWC. My network then starts with a transpose (injected by the ONNX conversion via keras2onnx), followed by a layout_transform operator on the inputs (and vice versa on the outputs).

I guess the reason the two don’t get optimized out is the “preserving the network spec” requirement. Is the best solution to look for this transpose -> layout_transform pattern and then just rewrite it with an identity?
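One way to decide whether such a transpose -> layout_transform pair can safely be replaced by an identity is to check that the two permutations cancel. A small sketch in plain Python, not tied to the Relay API: the helper names are invented for illustration, and it assumes the transpose axes and both layout strings are known and that the layouts are plain (no blocked layouts such as NCHW4c):

```python
def layout_perm(src, dst):
    """Transpose axes equivalent to a layout change from `src` to `dst`."""
    return tuple(src.index(axis) for axis in dst)


def compose(first, second):
    """Axes of the single transpose equal to applying `first`, then `second`."""
    return tuple(first[a] for a in second)


def cancels(transpose_axes, src_layout, dst_layout):
    """True if transpose(axes) followed by src->dst layout change is a no-op."""
    combined = compose(transpose_axes, layout_perm(src_layout, dst_layout))
    return combined == tuple(range(len(combined)))


# Transpose injected at import time: NHWC input -> NCHW, i.e. axes (0, 3, 1, 2),
# followed by a layout change NCHW -> NHWC.
print(cancels((0, 3, 1, 2), "NCHW", "NHWC"))
# A pair that does not cancel and therefore must stay in the graph.
print(cancels((0, 2, 3, 1), "NCHW", "NHWC"))
```

Only pairs for which the check succeeds are candidates for the identity rewrite; anything else still changes the data order and has to be kept.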

At least, that’s what I do. But if we have to write passes for every operator, we should update this pass instead.

The Relay pattern rewrite doesn’t require an implementation for every operator. It would require defining patterns for these “corner cases”, in which we are trying to undo a layout transform that was done outside of TVM. But that is my scenario.

I think yours is somewhat different, since you have layout_transforms preceded by other operators at the inputs, and vice versa at the outputs. All those extra operators are part of the Relay import/optimization routines.

Agree. The problem is that I see the issue even with standard models like ResNet.

So I’m afraid this isn’t just for corner cases.

@mikeseven You are right about the observation that the layout transforms are not exactly at the beginning and the end. More specifically, the first layout transform will be before the first conv2d op. And similarly, the last layout transform will be before the first op that does not support convert layout (like reduce sum, dense, etc.).

This was never a problem because for traditional CPU/GPU targets, it does not matter much whether nn.pad runs in NCHW or NHWC layout. But I think it matters in your case.

Thanks for the clarification. I understand the logic behind these choices now. But practically I fail to see the rationale for the use case.

If you have a graph on which you run a layout transformation, you want the whole graph to be converted. From a performance point of view, it just doesn’t make sense to still have to do layout transforms, regardless of whether layout matters for downstream ops.