What is the recommended way to generate TIR level operations?

agoston-mc · January 20, 2025, 1:07pm

I want to generate TIR-level operations automatically. So far I have found two ways to do that:

te.compute
tir.ir_builder

However, I have run into issues with both of them:

te.compute
1. when the computation is not executed over the whole tensor, e.g. a partial update during a computation step
2. when the output dtype differs from the input dtype (argmin/argmax operations)
tir.ir_builder
1. it can only be used as an external function, so transformations (thread binding, tuning, etc.) cannot be applied to it
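
To make the two te.compute cases listed above concrete, here is a plain-Python sketch of the semantics involved. It is not TVM code and does not use any TVM API; the function names partial_update and argmin, and the scale parameter, are made up purely for illustration.

```python
from typing import List


def partial_update(state: List[float], start: int, stop: int, scale: float) -> List[float]:
    """Case 1: write only state[start:stop]; all other elements are carried over."""
    out = list(state)  # copy, so the untouched region keeps its previous values
    for i in range(start, stop):
        out[i] = state[i] * scale
    return out


def argmin(values: List[float]) -> int:
    """Case 2: float input, integer output (index of the smallest element)."""
    best = 0
    for i in range(1, len(values)):
        if values[i] < values[best]:
            best = i
    return best


state = [1.0, 2.0, 3.0, 4.0]
updated = partial_update(state, 1, 3, 10.0)  # only indices 1 and 2 are rewritten
idx = argmin([3.5, 0.5, 2.0])                # an int result computed from float data
```

In the first function the loop covers only part of the output's index space, and in the second the result type is unrelated to the element type of the input. These are the two properties I am having trouble expressing.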

Because of the above, I am looking into calling the TIR constructors directly, since that makes separate indexing possible and the resulting tensor is not external.

So my question is: what is the currently recommended way to go about this?