Hi,
I am trying to understand the convolution inference code that TVM generates for GPUs.
In particular, I want to know whether the code performs direct convolution or some kind of approximation. The data-packing and weight-packing steps multiply the original values by several constants. If those steps only changed the data layout, the constants would not be needed, so I wonder why they are there, for example -2.5 and 5e-1 in the snippet below. This OpenCL code was generated for the convolution in layer 2 of the VGG network on a GPU.
Thanks!
data_pack_local[1] = 0.000000e+00f;
data_pack_local[1] = (data_pack_local[1] + d[1]);
data_pack_local[1] = (data_pack_local[1] + (d[2] * -2.500000e+00f));
data_pack_local[1] = (data_pack_local[1] + (d[3] * 5.000000e-01f));
data_pack_local[1] = (data_pack_local[1] + d[4]);
data_pack_local[1] = (data_pack_local[1] + (d[7] * -1.500000e+00f));
data_pack_local[1] = (data_pack_local[1] + ((d[8] * -1.500000e+00f) * -2.500000e+00f));
data_pack_local[1] = (data_pack_local[1] + ((d[9] * -1.500000e+00f) * 5.000000e-01f));
data_pack_local[1] = (data_pack_local[1] + (d[10] * -1.500000e+00f));
data_pack_local[1] = (data_pack_local[1] + (d[13] * -2.000000e+00f));
data_pack_local[1] = (data_pack_local[1] + ((d[14] * -2.000000e+00f) * -2.500000e+00f));
data_pack_local[1] = (data_pack_local[1] + ((d[15] * -2.000000e+00f) * 5.000000e-01f));
data_pack_local[1] = (data_pack_local[1] + (d[16] * -2.000000e+00f));
data_pack_local[1] = (data_pack_local[1] + (d[19] * 1.500000e+00f));
data_pack_local[1] = (data_pack_local[1] + ((d[20] * 1.500000e+00f) * -2.500000e+00f));
data_pack_local[1] = (data_pack_local[1] + ((d[21] * 1.500000e+00f) * 5.000000e-01f));
data_pack_local[1] = (data_pack_local[1] + (d[22] * 1.500000e+00f));
data_pack_local[1] = (data_pack_local[1] + d[25]);
data_pack_local[1] = (data_pack_local[1] + (d[26] * -2.500000e+00f));
data_pack_local[1] = (data_pack_local[1] + (d[27] * 5.000000e-01f));
data_pack_local[1] = (data_pack_local[1] + d[28]);
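
To make the pattern in the generated code easier to see, here is a compact restatement of the same arithmetic in plain C. It is only a sketch: the function name and the two coefficient arrays are hypothetical, and the 6x6 row-major layout of d is an assumption inferred from the index stride of 6 in the snippet. Every term in the snippet is d[r * 6 + c + 1] times one per-row constant and one per-column constant.

```c
/* Per-row and per-column constants read off the generated snippet.
   The array names are hypothetical, not taken from TVM. */
static const float ROW_COEF[5] = {1.0f, -1.5f, -2.0f, 1.5f, 1.0f};
static const float COL_COEF[4] = {1.0f, -2.5f, 0.5f, 1.0f};

/* Computes the same value as data_pack_local[1] in the snippet.
   Assumption: d is a 6x6 tile stored row-major (36 floats). */
float data_pack_element_1(const float d[36]) {
    float acc = 0.0f;
    for (int r = 0; r < 5; ++r) {        /* tile rows 0..4 */
        for (int c = 0; c < 4; ++c) {    /* tile columns 1..4 */
            acc += d[r * 6 + (c + 1)] * ROW_COEF[r] * COL_COEF[c];
        }
    }
    return acc;
}
```

Written this way, the constants form a separable weighted sum over a tile of the input, not a pure layout change. That shape would be consistent with the input transform of a Winograd-style convolution, which TVM does provide for GPUs, but this is a guess that the snippet alone does not confirm.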