[TensorIR] Loop transformations which breaks dependency precedence condition is not properly detected

wrongtest · November 25, 2021, 6:45am

Hi All,

The following script depicts the problem

@T.prim_func
def func(data: T.handle) -> None:
    X = T.match_buffer(data, [4, 4], dtype="int32")
    for i, j in T.grid(4, 4):
        with T.block("block"):
            vi, vj = T.axis.remap("SS", [i, j])
            with T.init():
                X[vi, vj] = 0
            X[vi, vj] = T.if_then_else(vi > 0 and vj > 0, X[vi-1, vj-1], 0, dtype="int32") + \
                        T.if_then_else(vi > 0 and vj < 3, X[vi-1, vj+1], 0, dtype="int32") + 1

# direct test
f1 = tvm.build(func, "llvm")
a = tvm.nd.array(np.ones([4, 4], "int32"))
f1(a)
# reorder (i,j) -> (j,i), which should not be valid 
ai, aj = s.get_loops(s.get_block("block"))
s.reorder(aj, ai)
f2 = tvm.build(s.mod["main"], "llvm")
b = tvm.nd.array(np.ones([4, 4], "int32"))
f2(b)
# get incorrect results
tvm.testing.assert_allclose(a.numpy(), b.numpy())

For this example, since the precedence conditions (i, j) <- (i-1, j-1), (i, j) <- (i-1, j+1) should be preserved, the reorder should not be allowed. However, current schedule primitives do not detect this invalid operation.

It should be ok for developers manually performing the schedule since they are assumed to “know what you are doing”. But it would be too hard to debug in applications of somewhat “auto” fashion, where transformations are automatically done during searching and tuning.

junrushao · November 26, 2021, 6:39am

I see, so there is loop-carried dependency. Is my understand correct?

wrongtest · November 26, 2021, 6:57am

Yes, exactly:)

This example is just something like 1D-stencil, seems often get talked about in a general loop optimization context.

junrushao · November 27, 2021, 4:45am

Yeah it’s an interesting case. I remember that we explicitly disallowed scheduling with loop-carried dependency @Hzfengsy @spectrometerHBH

Hzfengsy · December 2, 2021, 10:15am

In this case, vi and vj is not spatial axes (which can be executed parallel without dependency). TensorIR will not and CANNOT check the block iterator type.

wrongtest · December 2, 2021, 10:26am

Thanks, if I define axis like

vi = T.axis.opaque(4, i)
vj = T.axis.opaque(4, j)

The illegal reorder schedule will show

Error message: The block tir.Block#0 under the loops to be reordered have block iter type other than data-parallel or reduction