[RFC] TensorIR: A schedulable IR for TVM

Thanks for the proposal! This definitely opens up more opportunities for performance optimization. Two questions for clarification:

  1. IIUC, based on the proposal and discussion, we will have both TE and TIR, with TE acting as a frontend wrapper over TIR for users who prefer writing a high-level DSL. What happens to the TE schedule primitives, then? Intuitively, we should keep them; otherwise TE writers would have no way to schedule their computations, since they know nothing about TIR and blocks. (See the TE sketch after these questions for what I mean by the primitives.)

  2. Does this proposal support dynamic shapes (i.e., Any)? For example, can we write something like:

    @tvm.hybrid.script
    def matmul(a: ty.handle, b: ty.handle, c: ty.handle) -> None:
        C = tir.match_buffer(c, (1024, 1024), "float32")
        A = tir.match_buffer(a, (1024, Any), "float32")  # second dim is dynamic
        B = tir.match_buffer(b, (Any, 1024), "float32")  # first dim is dynamic
        reducer = tir.comm_reducer(lambda x, y: x + y, tir.float32(0))

        # The reduction extent is the shared dynamic dim of A and B.
        with tir.block([1024, 1024, tir.reduce_axis(0, Any)], "C") as [vi, vj, vk]:
            reducer.step(C[vi, vj], A[vi, vk] * B[vk, vj])

    s = tir.create_schedule(matmul)
    update = s.get_block("C")
    i, j, k = s.get_axes(update)
    i_o, i_i = s.split(i, bn)  # bn: some fixed tile size, e.g. 32
    j_o, j_i = s.split(j, bn)
    k_o, k_i = s.split(k, 4)   # fixed factor over the dynamic axis
    

    In this case, the extent of vk (or k) is Any. Can we still apply split to it with a fixed factor? (A TE sketch of the analogous dynamic-shape split is given below for reference.)
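As a concrete reference for question 1, this is the kind of TE scheduling flow I am asking about; a minimal sketch using the existing TE API, with an illustrative matmul and tile factor that are not from the RFC:

    import tvm
    from tvm import te

    # Classic TE: declare the compute, then schedule it with TE primitives.
    A = te.placeholder((1024, 1024), name="A")
    B = te.placeholder((1024, 1024), name="B")
    k = te.reduce_axis((0, 1024), name="k")
    C = te.compute((1024, 1024),
                   lambda i, j: te.sum(A[i, k] * B[k, j], axis=k),
                   name="C")

    s = te.create_schedule(C.op)
    # These are the "TE schedule primitives" in question: split, reorder, etc.
    io, ii = s[C].split(C.op.axis[0], factor=32)
    jo, ji = s[C].split(C.op.axis[1], factor=32)

If TE becomes a wrapper over TIR, presumably these calls would either have to lower to the new TIR scheduling or stay as a separate path.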
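For question 2, a point of reference for how a fixed-factor split over a dynamic extent behaves in TE today: a symbolic dimension can still be split by a constant factor, and lowering inserts a guard for the tail iterations. A minimal TE sketch (variable names are mine), assuming te.size_var for the unknown length:

    import tvm
    from tvm import te

    n = te.size_var("n")  # dynamic reduction length, analogous to Any
    A = te.placeholder((1024, n), name="A")
    B = te.placeholder((n, 1024), name="B")
    k = te.reduce_axis((0, n), name="k")
    C = te.compute((1024, 1024),
                   lambda i, j: te.sum(A[i, k] * B[k, j], axis=k),
                   name="C")

    s = te.create_schedule(C.op)
    ko, ki = s[C].split(k, factor=4)  # fixed factor, dynamic extent
    # The lowered IR gives the outer loop an extent of ceildiv(n, 4) and
    # guards the inner loop body for the tail when 4 does not divide n.
    print(tvm.lower(s, [A, B, C], simple_mode=True))

The question is whether TIR's s.split on a block axis can do the same.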
