Hi, I tried the tensorize below, which just does an array shift:

import tvm

def intrin_pool(l):
    # Declare the intrinsic's compute: a shift by one element.
    A = tvm.placeholder((l,), name='AAA')
    pool = tvm.compute((l-1,), lambda i: A[i + 1], name='p')

    def intrin_func(ins, outs):
        # Lower the intrinsic to a packed call on the input/output buffers.
        dinp = ins[0]
        dout = outs[0]
        return tvm.call_packed("op", dinp, dout)

    with tvm.build_config(offset_factor=1):
        return tvm.decl_tensor_intrin(pool.op, intrin_func)

l = 64
A = tvm.placeholder((l,), name='A')
# Same compute as in the intrinsic declaration.
P = tvm.compute((l-1,), lambda i: A[i + 1], name='p')
s = tvm.create_schedule(P.op)
intrin = intrin_pool(l)
s[P].tensorize(P.op.axis[0], intrin)
print(tvm.lower(s, [A, P], simple_mode=True))

But it complains:

tvm._ffi.base.TVMError: [10:54:35] /home/dixing/tvm/src/op/tensorize.cc:317: Check failed: Equal(lhs, rhs) Failed to match the compute with TensorIntrin tensor_intrin's declaration
 provided= AAA(i),
 intrin= AAA((1 + i))

The two compute expressions are exactly the same, but it seems that some optimization makes them differ when tensorize is performed. Is this a bug?
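
One possible explanation, stated as an assumption rather than a confirmed diagnosis: the matcher may rewrite the index of the provided compute so that it is relative to the start of the input region being read (here A[1..63], so the start is 1), while the intrinsic declaration is compared as written. The toy model below is plain Python, not TVM code; `access` and `rebase` are hypothetical helpers that only illustrate how such a rebasing would turn `A[i + 1]` into `AAA(i)` and make a structural equality check fail.

```python
# Toy model only: these helpers are hypothetical and are NOT TVM APIs.

def access(tensor, offset):
    """Represent the read tensor[i + offset] as a (name, offset) pair."""
    return (tensor, offset)

def rebase(expr, region_min):
    """Rewrite an access so its index is relative to the start of the region read."""
    tensor, offset = expr
    return (tensor, offset - region_min)

# Intrinsic declaration: AAA[i + 1], compared as declared.
intrin = access("AAA", 1)

# Scheduled compute: A[i + 1] for i in [0, 63) reads A[1..63], so the region starts at 1.
# Assumption: the matcher renames A to AAA and subtracts that start from the index.
provided = rebase(access("AAA", 1), region_min=1)

print("provided =", provided)
print("intrin   =", intrin)
print("structurally equal:", provided == intrin)
```

If this reading is right, the two sides differ by exactly the region offset, which is what the `provided= AAA(i)` versus `intrin= AAA((1 + i))` lines in the error show.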
dixing