I tried the naive thing:
@T.prim_func
def my_module(A: T.Buffer((24,), "float32"), B: T.Buffer((24,), "float32")):
    T.func_attr({"from_legacy_te_schedule": True, "global_symbol": "my_module", "tir.noalias": True})
    C = T.allocate([24], "float32", "global")
    for i_outer in range(6):
        C_1 = T.Buffer((24,), data=C)
        A_1 = T.Buffer((24,), data=A.data)
        B_1 = T.Buffer((24,), data=B.data)
        C_1[i_outer * 4:i_outer * 4 + 4] = A_1[i_outer * 4:i_outer * 4 + 4] + B_1[i_outer * 4:i_outer * 4 + 4]
but that gave me an error:
TypeError: unsupported operand type(s) for +: 'BufferRegion' and 'BufferRegion'
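For reference, here is how I would write the same computation with an explicit T.vectorized inner loop instead of the slice syntax the printer emits. This is my own sketch (names like vec_add are mine, not TVM output); what I am really after is how to write the already-vectorized, ramp-based form directly:

from tvm.script import tir as T

# Sketch: same element-wise add, but the vectorization is expressed as a
# T.vectorized loop rather than ramp/slice indexing.
@T.prim_func
def vec_add(A: T.Buffer((24,), "float32"), B: T.Buffer((24,), "float32"), C: T.Buffer((24,), "float32")):
    for i_outer in range(6):
        for i_inner in T.vectorized(4):
            C[i_outer * 4 + i_inner] = A[i_outer * 4 + i_inner] + B[i_outer * 4 + i_inner]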
By the way, the TVMScript above is the result of lowering a TE schedule, running a couple of TIR passes on it, and printing the result with .script():
import tvm
from tvm import te
from tvm.driver.build_module import schedule_to_module

dtype = "float32"
num = 24
A = te.placeholder((num,), dtype=dtype, name="A")
B = te.placeholder((num,), dtype=dtype, name="B")
C = te.compute((num,), lambda i: A[i] + B[i], name="C")

# Split the single axis by 4 and vectorize the inner loop.
s = te.create_schedule([C.op])
x, = s[C].op.axis
xo, xi = s[C].split(x, 4)
s[C].vectorize(xi)

out = schedule_to_module(s, [A, B], "my_module")
out = tvm.tir.transform.StorageFlatten(64)(out)  # to get rid of https://github.com/apache/tvm/issues/14342
out = tvm.tir.transform.VectorizeLoop()(out)
print(out.script())
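For completeness, this is roughly how I would check whether the printed script round-trips; it is just a sketch, and I am assuming tvm.script.from_source is the right entry point for parsing TVMScript text back into an IRModule:

# Sketch of a round-trip check: print the lowered module and parse the text back.
# (Assumes tvm.script.from_source is the appropriate parser entry point.)
src = out.script()
reparsed = tvm.script.from_source(src)
tvm.ir.assert_structural_equal(out, reparsed)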
What is the correct way of expressing vectors in TVMScript? Let me know!