The compute is defined as below:
import tvm

def filter_compute(dtype, M, N, RD):
    A = tvm.placeholder((M, N), name='A', dtype=dtype)
    # Reduction axes over the RD x RD window
    rx = tvm.reduce_axis((0, RD), name='rx')
    ry = tvm.reduce_axis((0, RD), name='ry')
    H_RD = RD // 2  # half window size, used to center the window on (i, j)
    D = tvm.compute((M, N),
                    lambda i, j: tvm.sum(A[i - H_RD + ry, j - H_RD + rx], axis=[ry, rx]),
                    name='D')
    return [A, D]
The generated lowered code (M=1080, N=1920, RD=5):
produce D {
  for (i, 0, 1080) {
    for (j, 0, 1920) {
      D[((i*1920) + j)] = (uint8)0
      for (ry, 0, 5) {
        for (rx, 0, 5) {
          D[((i*1920) + j)] = (D[((i*1920) + j)] + A[(((((i*1920) + (ry*1920)) + j) + rx) - 3842)])
        }
      }
    }
  }
}
The code above reads past A's borders and causes an out-of-bounds memory access. For example, at i=0, j=0, ry=0, rx=0 the index into A is -3842.
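To make the overrun concrete, here is a small sketch in plain Python (no TVM needed) that evaluates the flat index expression from the lowered code at the two extreme corners of the iteration space. The helper name `flat_index` is hypothetical and only for illustration. The constant 3842 is `H_RD*N + H_RD` with `H_RD = 2` and `N = 1920`.

```python
# Shapes from the example above: M=1080, N=1920, RD=5.
M, N, RD = 1080, 1920, 5
H_RD = RD // 2  # half window size = 2

def flat_index(i, j, ry, rx):
    """Flat offset into A, copied from the lowered code (hypothetical helper)."""
    return (i * N + ry * N + j + rx) - (H_RD * N + H_RD)

# Smallest and largest offsets that the loop nest ever touches.
lowest = flat_index(0, 0, 0, 0)
highest = flat_index(M - 1, N - 1, RD - 1, RD - 1)
size = M * N  # number of elements in A; valid offsets are 0 .. size-1

print("lowest offset :", lowest)
print("highest offset:", highest)
print("elements in A :", size)
```

Under these shapes the lowest offset is negative and the highest is beyond the last valid offset `size - 1`, so the loop nest reads outside the buffer on both sides.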
So how can TVM compute this with safe border checking automatically, without using padding or other hand-written checks?