Ajja
June 13, 2019, 2:14pm
1
Hi @Laurawly, I am trying to understand the OpenCL scheduler. I've read OpenCL AutoTVM questions - #8 by sebap - Development - Apache TVM Discuss but still cannot figure out a couple of things. I'd appreciate some answers:
In the nchw scheduler, do you convert the kernel to NCHW16c? (See the small NumPy sketch at the end of this post for what I mean by that layout.)
Why is the kernel, in the kernel_vec operation, divided into blocks along the first axis (num_filter) and not the second one (channel)?
What is the purpose of the channels argument in the convolution operation (in python/relay/op/nn.py)? When could it be useful?
How was this if statement created? Are these numbers tuned for a specific iGPU? If yes, which one?
block_w = 1
block_h = 1
if stride_h == 2:
    if num_filter + kernel_h == 515:
        block_h = 4
        block_w = 4
    else:
        block_h = 4
        block_w = 5
elif kernel_h == 3:
    if num_filter == 512:
        block_h = 2
        block_w = 7
    else:
        block_h = 2
        block_w = 14
elif kernel_h == 7 and padding == 3 and stride == 1:
    block_h = 3
    block_w = 4
else:
    block_h = 1
    block_w = 16
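For reference, here is what I understand NCHW16c to mean; a minimal NumPy sketch (my own illustration, not TVM code, with made-up shapes) where the channel axis is split into blocks of 16 and the block is moved innermost:

import numpy as np

# Split C into C//16 outer blocks plus a 16-wide inner block (the "16c" axis),
# then move the inner block to the innermost position: NCHW -> NCHW16c.
N, C, H, W, c = 1, 32, 8, 8, 16
data_nchw = np.random.rand(N, C, H, W).astype("float32")
data_nchw16c = data_nchw.reshape(N, C // c, c, H, W).transpose(0, 1, 3, 4, 2)
assert data_nchw16c.shape == (N, C // c, H, W, c)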
Ajja
June 17, 2019, 1:46pm
2
I’d really appreciate some answers.
Thanks.
Ajja
June 21, 2019, 1:37pm
4
Thank you very much for the answer, @Laurawly. Based on your post, I have a couple more questions:
I think we were talking about different parts of the code. My concern about NCHW16c was about this part:
if not out_width % block_w == 0:
    c_w = (out_width // block_w + 1) * block_w

if not out_height % block_h == 0:
    c_h = (out_height // block_h + 1) * block_h

pad_before = [0, 0, pad_top, pad_left]
pad_after = [0, 0, pad_down + c_h - block_h, pad_right + c_w - block_w]
temp = pad(data, pad_before, pad_after, name="pad_temp")

nv = 16
if not num_filter % nv == 0:
    num_filter = (num_filter // nv + 1) * nv
out_channel = num_filter

cshape = (batch, out_channel // nv, c_h, c_w, nv)
kvshape = (num_filter // nv, channel, kernel_h, kernel_w, nv)

kernel_vec = tvm.compute(
    kvshape,
    lambda co, ci, kh, kw, vc:
    kernel[co * nv + vc][ci][kh][kw], name='kernel_vec')
I thought that by dividing out_channel by nv you were converting the data to the NCHW16c format, but now I see that you are splitting the output channels. Are you creating subgroups in this part? And why do you do this division in the compute rather than in the schedule using the split method, as in the sketch below? What difference does it make to divide it there?
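To make the question concrete, this is the alternative I have in mind; a toy example of mine built on tvm.te (the names data, out, and the use of the later te module are my assumptions, not the 2019 TOPI code):

import tvm
from tvm import te

# Toy elementwise compute over a plain 4-D NCHW tensor.
N, CO, H, W, nv = 1, 64, 32, 32, 16
data = te.placeholder((N, CO, H, W), name="data")
out = te.compute((N, CO, H, W),
                 lambda n, co, h, w: data[n, co, h, w] + 1.0, name="out")

# The same 16-wide output-channel blocking, but expressed in the schedule
# with split() instead of being baked into a 5-D compute shape.
s = te.create_schedule(out.op)
co_outer, co_inner = s[out].split(out.op.axis[1], factor=nv)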
I read your comment about the alter_layout function and don't quite understand how it works. It isn't used in AutoTVM because it is enabled only when opt_level = 3, is it? Here is the docstring I am referring to:
Parameters
----------
target : Target
    The current target.
workload : Workload
    The current workload.
cfg : ConfigSpace
    The specific configuration.

Note
----
This interface is for cases when TVM decides to replace an operator in the graph.
For example, the `AlterOpLayout` pass (enabled when `opt_level = 3`) replaces `NCHW`
convolution with an `NCHW[x]c` implementation on x86 CPUs.
Thus in TOPI, we first query the schedule using the original `NCHW` workload,
then update the dispatcher with the new `NCHW[x]c` workload, so that later on,
the `NCHW[x]c` convolution can get its schedule from the dispatcher using
its own workload directly.

.. code-block:: python

    @conv2d_alter_layout.register("cpu")
So, do you use it to replace conv2d's compute (e.g. https://github.com/dmlc/tvm/blob/master/topi/python/topi/intel_graphics/conv2d.py#L323) with https://github.com/dmlc/tvm/blob/master/topi/python/topi/intel_graphics/conv2d.py#L55? If yes, how exactly does this alter_layout method work? Does it contain some implicit conversion of the input from a given data layout to NCHW[x]c and a conversion back to that layout afterwards?
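For concreteness, this is the rough shape I imagine such a hook having, pieced together from the docstring above; the function name, signature, and attribute keys here are my guesses, not the actual TVM source:

# Hypothetical sketch of an alter-layout hook following the docstring above.
# The decorator is quoted from the docstring; the body is my guess at the
# general pattern, not the real x86 implementation.
@conv2d_alter_layout.register("cpu")
def _alter_conv2d_layout(attrs, inputs, tinfos, F):
    new_attrs = {k: attrs[k] for k in attrs.keys()}
    # Ask the conv to consume/produce blocked layouts; the AlterOpLayout pass
    # then inserts the layout_transform ops around it automatically.
    new_attrs["data_layout"] = "NCHW16c"
    new_attrs["kernel_layout"] = "OIHW16i16o"
    return F.nn.contrib_conv2d_nchwc(*inputs, **new_attrs)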
Ajja
July 3, 2019, 2:16pm
6
Thank you, @Laurawly, for the answer. Could you also explain why block_h and block_w are used to add extra bottom and right padding?
c_h = out_height
c_w = out_width

if not out_width % block_w == 0:
    c_w = (out_width // block_w + 1) * block_w

if not out_height % block_h == 0:
    c_h = (out_height // block_h + 1) * block_h

pad_before = [0, 0, pad_top, pad_left]
pad_after = [0, 0, pad_down + c_h - block_h, pad_right + c_w - block_w]
temp = pad(data, pad_before, pad_after, name="pad_temp")

nv = 16
if not num_filter % nv == 0:
    num_filter = (num_filter // nv + 1) * nv
out_channel = num_filter

cshape = (batch, out_channel // nv, c_h, c_w, nv)
kvshape = (num_filter // nv, channel, kernel_h, kernel_w, nv)
Is it connected with the zero-padding explained here?
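To check my reading of the round-up, with made-up numbers out_width = 30 and block_w = 4 (my example, not values from the code above), I would expect:

# 30 is not a multiple of 4, so the output width is rounded up to the next
# multiple of the block width; the difference has to come from extra padding
# on the right of the input.
out_width, block_w = 30, 4
if not out_width % block_w == 0:
    c_w = (out_width // block_w + 1) * block_w
print(c_w)  # 32, i.e. 2 extra columns beyond out_width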