Hello,
I noticed that the plan to support compute_at() on TIR is tracked in this issue: https://github.com/apache/tvm/issues/7527
The compute_at() API in TVM's schedule primitives combines separate stages of computation where possible, which can improve performance or open up further optimization opportunities. For example, two loops with exactly the same iteration range can be merged into one loop with compute_at():
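from tvm import te

m = te.var("m")  # symbolic length shared by A, B and C
A = te.placeholder((m,), name="A")
B = te.compute((m,), lambda i: A[i] + 1, name="B")
C = te.compute((m,), lambda i: B[i] * 2, name="C")
s = te.create_schedule(C.op)
# Compute B inside C's loop over i, so both stages share one loop.
s[B].compute_at(s[C], C.op.axis[0])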
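In plain Python terms, the effect of this compute_at() call on the loop structure can be sketched roughly as follows. This is only an illustration with ordinary lists, not TVM code, and the function names `separate_loops` and `fused_loop` are made up for this post.

```python
def separate_loops(A):
    """Default schedule: B is fully computed before C starts."""
    m = len(A)
    B = [0] * m
    C = [0] * m
    for i in range(m):
        B[i] = A[i] + 1
    for i in range(m):
        C[i] = B[i] * 2
    return C


def fused_loop(A):
    """After compute_at: each B value is produced inside C's loop over i."""
    m = len(A)
    C = [0] * m
    for i in range(m):
        b = A[i] + 1  # the B stage, computed right where C needs it
        C[i] = b * 2
    return C


# Both versions compute the same C; only the loop structure differs.
print(separate_loops([1, 2, 3]) == fused_loop([1, 2, 3]))
```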
Similarly, when defining computations with tir.Block, we notice that there are situations where combining blocks is beneficial. For example,
for i, j in tir.grid(128, 128):
    with tir.block([128, 128], "A_Block") as [vi, vj]:
        A[vi, vj] = tir.float32(0)
for i, j in tir.grid(128, 128):
    with tir.block([128, 128], "B_Block") as [vi, vj]:
        B[vi, vj] = A[vi, vj]
There are two separate blocks in the above example. Will compute_at() support combining two blocks into one block in the future? If so, what conditions must the two blocks satisfy for compute_at() to perform the combination?
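To make the question concrete, here is a plain-Python sketch of the two loop nests from the TIR example and of the merged form we have in mind. It uses nested lists rather than TVMScript; the size parameter `n` and the function names `two_blocks` and `merged_block` are assumptions made for illustration only.

```python
def two_blocks(n=128):
    """Two separate loop nests, mirroring A_Block and B_Block."""
    A = [[None] * n for _ in range(n)]
    B = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            A[i][j] = 0.0  # A_Block
    for i in range(n):
        for j in range(n):
            B[i][j] = A[i][j]  # B_Block
    return A, B


def merged_block(n=128):
    """Hoped-for result: one loop nest whose body does both updates."""
    A = [[None] * n for _ in range(n)]
    B = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            A[i][j] = 0.0
            B[i][j] = A[i][j]
    return A, B


# The two forms produce identical buffers on a small size.
print(two_blocks(4) == merged_block(4))
```

In this sketch the merge looks safe because B[i][j] only reads the element of A written in the same iteration; whether that is the kind of condition compute_at() would check is part of what we are asking.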