Hello, I am new to TVM and am trying to understand how it works. In the convolution tutorial here, right before this line of code:
s[B].bind(bz, block_z)
The loop to fill Apad.shared is the following:
produce Apad.shared {
  for (ax0, 0, 16) {
    for (ax1, 0, 16) {
      for (ax2, 0, 256) {
        for (ax3, 0, 256) {
          Apad.shared[((((((ax0*16) + ax1)*256) + ax2)*256) + ax3)] = tvm_if_then_else(((((1 <= ax0) && (ax0 < 15)) && (1 <= ax1)) && (ax1 < 15)), A[(((((((ax0*14) + ax1)*256) + ax2)*256) + ax3) + -983040)], 0.000000f)
        }
      }
    }
  }
}
After the bind, it becomes the following:
produce Apad.shared {
  for (ax0, 0, 3) {
    for (ax1, 0, 3) {
      for (ax2, 0, 256) {
        for (ax3, 0, 256) {
          Apad.shared[((((((ax0*3) + ax1)*256) + ax2)*256) + ax3)] = tvm_if_then_else((((((1 - ax0) <= (blockIdx.z/14)) && ((blockIdx.z/14) < (15 - ax0))) && ((1 - (blockIdx.z % 14)) <= ax1)) && (ax1 < (15 - (blockIdx.z % 14)))), A[((((((((ax0*14) + ax1) + blockIdx.z)*256) + ax2)*256) + ax3) + -983040)], 0.000000f)
        }
      }
    }
  }
}
Why did the extent of the first two loops shrink from 16 to 3? Is Apad.shared still filled correctly?
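To make the question concrete, I tried comparing the two dumps by hand with a small plain-Python sketch (not TVM code; the function names are mine). It assumes that blockIdx.z is the fused output pixel index running over 0..195 (14 * 14), and that after the bind each block fills only the 3x3 window of Apad starting at row blockIdx.z/14 and column blockIdx.z % 14. Under those assumptions the guard and the index into A seem to agree between the two versions, but I am not sure this is the right way to read it:

```python
# Plain-Python sanity check of the two IR dumps; this is not TVM code.
# Assumption: blockIdx.z is the fused output pixel index in 0..195 (14 * 14).
OUT = 14  # assumed output height and width


def before(ax0, ax1, ax2, ax3):
    """Guard and flat index into A from the pre-bind loop nest (16 x 16)."""
    cond = (1 <= ax0 < 15) and (1 <= ax1 < 15)
    idx = ((ax0 * 14 + ax1) * 256 + ax2) * 256 + ax3 - 983040
    return cond, idx


def after(ax0, ax1, ax2, ax3, bz):
    """Guard and flat index into A from the post-bind loop nest (3 x 3)."""
    cond = ((1 - ax0) <= bz // 14 < (15 - ax0)) and \
           ((1 - bz % 14) <= ax1 < (15 - bz % 14))
    idx = (((ax0 * 14 + ax1) + bz) * 256 + ax2) * 256 + ax3 - 983040
    return cond, idx


def windows_match():
    """Check every block's 3x3 window against the full 16x16 version."""
    for bz in range(OUT * OUT):
        y0, x0 = bz // 14, bz % 14  # assumed window origin for this block
        for ax0 in range(3):
            for ax1 in range(3):
                for ax2, ax3 in ((0, 0), (255, 255)):  # sample inner indices
                    lhs = after(ax0, ax1, ax2, ax3, bz)
                    rhs = before(y0 + ax0, x0 + ax1, ax2, ax3)
                    if lhs != rhs:
                        return False
    return True


if __name__ == "__main__":
    print(windows_match())
```

If that reading is right, the offset -983040 is just -(14 + 1) * 256 * 256, i.e. the shift from padded coordinates back to coordinates in A.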
Thanks,
Giuseppe