Bug in tensorize?

Hi, when I tried the tensorize below, which just does an array shift:

```python
import tvm

def intrin_pool(l):
    A = tvm.placeholder((l,), name='AAA')
    pool = tvm.compute((l - 1,),
                       lambda i: A[i + 1], name='p')

    def intrin_func(ins, outs):
        dinp = ins[0]
        dout = outs[0]
        return tvm.call_packed("op", dinp, dout)

    with tvm.build_config(offset_factor=1):
        return tvm.decl_tensor_intrin(pool.op, intrin_func)

l = 64
A = tvm.placeholder((l,), name='A')
P = tvm.compute((l - 1,), lambda i: A[i + 1], name='p')
s = tvm.create_schedule(P.op)
intrin = intrin_pool(l)
s[P].tensorize(P.op.axis[0], intrin)
print(tvm.lower(s, [A, P], simple_mode=True))
```

But it complains:

```
tvm._ffi.base.TVMError: [10:54:35] /home/dixing/tvm/src/op/tensorize.cc:317: Check failed: Equal(lhs, rhs) Failed to match the compute with TensorIntrin tensor_intrin's declaration
provided= AAA(i), intrin= AAA((1 + i))
```

The computation expressions are exactly the same, but it seems some optimization changes them when performing tensorize. Is this a bug?

dixing

I am not sure if it is a bug.

I can see two reasons for this error.

The code produces the following HalideIR:

```
produce p {
  for (i, 0, 63) {
    p[i] = a[(i + 1)]
  }
}
```

Here, the input a is read over the range [min = 1, extent = 63], while p has [0, 62].

During tensorization TVM tries to normalize all the ranges to have min of zero and modifies corresponding expressions in the code.

In this example, AAA(i+1) over range (1, 63) becomes AAA(i) over range (0, 62). That is why the error says the provided expression is AAA(i).
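The shift can be modeled with a tiny, purely illustrative helper (`shift_region` is a hypothetical function for this sketch, not a TVM API):

```python
# Hypothetical model of tensorize's range normalization: each accessed
# region is shifted so its min becomes zero, and the constant offset in
# the index expression shrinks by the same amount.

def shift_region(index_offset, region):
    """For an access A[i + index_offset] over region (min, extent),
    return the offset and region after normalizing min to zero."""
    region_min, extent = region
    return index_offset - region_min, (0, extent)

# The question's body reads A[i + 1], so A's region is (min=1, extent=63).
offset, region = shift_region(1, (1, 63))
print(offset, region)  # 0 (0, 63) -> the access becomes plain A[i]
```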

Secondly, in the tensorize declaration, AAA(i+1) is canonically simplified to AAA(1+i) before matching.
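Put together, the two rewrites leave the matcher comparing structurally different trees. A sketch, using strings as stand-ins for the actual expression nodes that TVM compares with Equal:

```python
# Illustration only: the matcher compares expression trees structurally,
# not semantically, so the two forms below fail the Equal() check even
# though they describe the same access pattern.

provided = "AAA(i)"        # the compute body after range normalization
declared = "AAA((1 + i))"  # the intrinsic body after canonical simplification

assert provided != declared  # structural mismatch -> the reported error
```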


Thanks for the explanation. But why does tensorization always need a zero min? This loses all index information. Perhaps it is meant to ignore some of the index/offset info when there is tiling/splitting, but is it suitable to ignore all indices in every case? I'm not sure I understand the reason correctly.
