Combining Separate tir.Block using compute_at()

Thanks for asking!

To be clear, compute_at does not need to break block isolation. The producer block is moved under the consumer's loops, but the two remain separate blocks. For example, after compute_at, the IR may become:

for i in tir.range(0, 128):
    for j in tir.range(0, 128):
        with tir.block([128, 128], "A_Block") as [vi, vj]:
            A[vi, vj] = tir.float32(0)
        with tir.block([128, 128], "B_Block") as [vi, vj]:
            B[vi, vj] = A[vi, vj]
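
To make the point concrete without depending on a particular TVM version, here is a minimal plain-Python sketch of the same loop structure. It is only a model of the idea, not TVM code: the function names `separate_loop_nests` and `shared_loop_nest` are made up for illustration, and the nested lists stand in for the buffers A and B.

```python
# Plain-Python model of the loop structure above; this is not TVM code.
N = 128


def separate_loop_nests():
    """Before compute_at (assumed starting point): one loop nest per block."""
    A = [[None] * N for _ in range(N)]
    B = [[None] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            A[i][j] = 0.0  # body of A_Block
    for i in range(N):
        for j in range(N):
            B[i][j] = A[i][j]  # body of B_Block
    return A, B


def shared_loop_nest():
    """After compute_at: both blocks sit under the same loops,
    but each keeps its own body and its own (vi, vj) bindings."""
    A = [[None] * N for _ in range(N)]
    B = [[None] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            # A_Block: block variables bound to the shared loop variables
            vi, vj = i, j
            A[vi][vj] = 0.0
            # B_Block: a separate block with its own bindings
            vi, vj = i, j
            B[vi][vj] = A[vi][vj]
    return A, B


# Both loop structures compute the same buffers.
assert separate_loop_nests() == shared_loop_nest()
```

The two versions produce identical results; the only difference is where the loops live. In the shared version each block still reads and writes only through its own block variables, which is what keeps the blocks isolated.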
