My device needs the weights and input in NHWC layout, but the PyTorch model layout is NCHW.
I want to change the weight layout from NCHW to NHWC, and I came up with two ways:
In TVM Relay, add a layout transform before the convolution (sketch below). But this operation is too time-consuming, and every time you run the network, the transform has to run again.
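For reference, a minimal sketch of what that first way looks like in Relay (the shapes are illustrative, not from my actual model); the layout_transform ops become part of the graph, so they execute on every inference:

import tvm
from tvm import relay

# Illustrative shapes for a 7x7 conv on a 224x224 RGB input
data = relay.var("data", shape=(1, 3, 224, 224))    # NCHW activations
weight = relay.var("weight", shape=(64, 3, 7, 7))   # OIHW weights
nhwc_data = relay.layout_transform(data, "NCHW", "NHWC")
hwio_weight = relay.layout_transform(weight, "OIHW", "HWIO")
out = relay.nn.conv2d(nhwc_data, hwio_weight, kernel_size=(7, 7),
                      data_layout="NHWC", kernel_layout="HWIO")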
A better way is to convert the parameter layout to NHWC in advance. But when I manually transformed the layout in the PyTorch model, I ran into some errors:
transform layout before jit.trace():
model = weights_layout_NCHW2NHWnC(model)
model = torch.jit.trace(model, input_data).eval()
The error is:
Given groups=1, weight of size [64, 7, 7, 3], expected input[1, 224, 224, 3] to have 7 channels, but got 224 channels instead
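For context, a minimal standalone repro of this class of error (using a plain NHWC/OHWI permute for illustration, which matches the weight shape [64, 7, 7, 3] in the message above): torch.nn.Conv2d always interprets its input as NCHW and its weight as OIHW, so permuting the tensors up front just makes the shapes inconsistent:

import torch

conv = torch.nn.Conv2d(3, 64, kernel_size=7)
# Pre-permute weight OIHW -> OHWI and input NCHW -> NHWC
conv.weight = torch.nn.Parameter(conv.weight.permute(0, 2, 3, 1).contiguous())
x = torch.randn(1, 3, 224, 224).permute(0, 2, 3, 1)  # shape (1, 224, 224, 3)
conv(x)  # raises the RuntimeError above: conv2d still reads dim 1 as channels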
transform layout after jit.trace(), before relay.frontend.from_pytorch()
My device is an embedded low-power device, so I don't want these redundant operations to run on it. I want to preprocess as much as possible on the host side.
You can use the ConvertLayout pass, as in the snippet below. After that, when you call relay.build(...) on the transformed module, the parameters will also be transformed to NHWC at compile time. You can check the output of relay.build(...) to see that the returned parameters are in NHWC (HWIO, to be precise).
desired_layouts = {'nn.conv2d': ['NHWC', 'default']}
# Convert the conv2d layout to NHWC ('default' picks the matching kernel layout, HWIO).
# RemoveUnusedFunctions is used to clean up the graph.
seq = tvm.transform.Sequential([relay.transform.RemoveUnusedFunctions(),
                                relay.transform.ConvertLayout(desired_layouts)])
with tvm.transform.PassContext(opt_level=3):
    mod = seq(mod)
print(mod)
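For example (a minimal sketch, assuming mod and params came from relay.frontend.from_pytorch; llvm is just a placeholder target), you can build the converted module and print the returned parameter shapes:

import tvm
from tvm import relay

with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target="llvm", params=params)

# Conv weights should now be HWIO, e.g. (7, 7, 3, 64) instead of (64, 3, 7, 7)
for name, arr in lib.get_params().items():
    print(name, arr.shape)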
Is the layout_transform() in the first row repeated every time the network is run? What I mean is: is it necessary to convert the layout of the network weights every time you run the network?
It should only be done once.
I found that after calling relay.build(...), the shapes of the returned parameters have changed. So what is the use of layout_transform() in the Relay graph?
You don't have to worry about the layout transform on weights. The ConvertLayout pass inserts layout_transform on both the input and the weights. Since the weights are constant, layout_transform(weight) can be done at compile time, and that's why you see the weight shapes have changed after relay.build.
You can use opt_mod, opt_params = relay.optimize(mod, params, target) to see what the optimized graph looks like; there should be no layout_transform(weight) left. This is what relay.build() uses.
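A sketch of that check (same placeholder target as above, keyword arguments to be explicit):

from tvm import relay

opt_mod, opt_params = relay.optimize(mod, target="llvm", params=params)
print(opt_mod)  # the layout_transform on weights has been folded away
for name, arr in opt_params.items():
    print(name, arr.shape)  # conv weights already in HWIO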
I guess there is currently no inference strategy in TVM suitable for NHW16nC, which will make inference difficult if you have to keep that data layout.