Question about conv2d x86 schedule template

The search space for conv2d x86 is defined as

cfg.define_split("tile_ic", in_channel, num_outputs=2)
cfg.define_split("tile_oc", num_filter, num_outputs=2)
cfg.define_split("tile_ow", ow, num_outputs=2, filter=lambda y: y.size[-1] <= 64,
                 policy="verbose")
if is_kernel_1x1:
    cfg.define_knob("tile_oh", [1, 2] if oh > 1 else [1])
else:
    cfg.define_knob("unroll_kw", [True, False])

as shown in

I have seen that “tile_ic” and “tile_oc” are used for the data layout transformation. However, I am not sure where “tile_ow” and “unroll_kw” are used. I also checked nn.conv2d_NCHWc in

but didn’t find anything either.

I would really appreciate any information you can provide on this issue.

You can search for tile_ow in the GitHub repo to find its use sites. For example: https://github.com/apache/incubator-tvm/blob/0cfdecdae99582998dae5c2c3fdfd7a2700f10c0/topi/python/topi/x86/conv2d_avx_1x1.py#L64
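For intuition, here is a toy sketch in plain Python (not TVM IR; the sizes and factor choices below are hypothetical) of what these two knobs control: tile_ow splits the output-width loop into an outer/inner pair, and unroll_kw decides whether the kernel-width reduction loop stays a loop or is written out as straight-line code.

```python
# Toy sketch, not TVM code: hypothetical sizes and tuner-chosen factors.
ow = 16
ow_inner = 4              # a factor the tuner could pick for "tile_ow"
ow_outer = ow // ow_inner

# tile_ow: the single loop over ow becomes an outer/inner loop nest,
# where the inner loop is the one that typically gets vectorized.
visited = []
for owo in range(ow_outer):
    for owi in range(ow_inner):
        visited.append(owo * ow_inner + owi)
assert visited == list(range(ow))  # the tiling covers every column once

# unroll_kw: whether the kernel-width reduction stays a loop (False)
# or is replicated into straight-line code (True). Same result either way.
data = [1.0, 2.0, 3.0]
weight = [0.5, 0.25, 0.125]
acc_loop = sum(data[k] * weight[k] for k in range(3))       # unroll_kw=False
acc_unrolled = (data[0] * weight[0]
                + data[1] * weight[1]
                + data[2] * weight[2])                      # unroll_kw=True
assert acc_loop == acc_unrolled
```

The tuner then measures both variants and keeps whichever is faster on the target CPU.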

Thank you for your reply @comaniac. Does that mean that tile_ow is not used when the regular nn.conv2d compute is used, and that it may be used by other schedules such as conv2d_avx_1x1?

I think so, because conv2d_avx_1x1 (and some other schedules) shares the same compute function as conv2d.

That is interesting, since that would make the search space bigger for

@autotvm.register_topi_compute("conv2d_NCHWc.x86")
def conv2d_NCHWc(cfg, data, kernel, strides, padding, dilation, layout, out_layout, out_dtype):

even though the operation performed by that schedule is nn.conv2d_NCHWc, which uses neither tile_ow nor unroll_kw.

It’s true, but it should be fine in my opinion, because the search space for x86 workloads is still acceptable in most cases.

Hi @comaniac, I have one follow-up question if you would be so kind. Do you know why we go from a 4-D to a 6-D tensor when it comes to the kernel? I understand the N[C/c]HW[c] transformation, but I am having trouble understanding the one for the kernel (which is not 5-D but 6-D).

Is it perhaps the same idea, but applied to the output-filter dimension since that number is large, followed by a large factor in the second-fastest dimension given by the lowercase “c”?

Because the kernel includes the channels of not only the input but also the output, so both channel dimensions get a blocked inner factor.
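To make that concrete, here is a NumPy sketch of how a 4-D OIHW kernel can be blocked into six dimensions. The block sizes below are hypothetical tuner choices, and the exact layout string may differ between TVM versions; the point is just that both the O and I axes are split, which is where the two extra dimensions come from.

```python
import numpy as np

# Hypothetical shapes and block sizes (the inner factors of tile_oc / tile_ic).
O, I, H, W = 32, 16, 3, 3
oc_bn, ic_bn = 8, 4

kernel = np.arange(O * I * H * W, dtype=np.float32).reshape(O, I, H, W)

# (O, I, H, W) -> (O//oc_bn, oc_bn, I//ic_bn, ic_bn, H, W): split both
# channel axes into an outer block index and an inner offset.
k6 = kernel.reshape(O // oc_bn, oc_bn, I // ic_bn, ic_bn, H, W)

# Reorder to (O//oc_bn, I//ic_bn, H, W, ic_bn, oc_bn), i.e. an "OIHWio"-style
# layout with the inner input/output channel offsets innermost.
k6 = k6.transpose(0, 2, 4, 5, 3, 1)
assert k6.shape == (O // oc_bn, I // ic_bn, H, W, ic_bn, oc_bn)

# The transform is just a relabeling: we can recover the original kernel.
back = k6.transpose(0, 5, 1, 4, 2, 3).reshape(O, I, H, W)
assert np.allclose(back, kernel)
```

So the data tensor is 5-D (NCHWc) because it has one channel axis to block, while the kernel is 6-D because it has two.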
