Thoughts on a Simpler Scheduling Language

junrushao · February 11, 2021, 3:27am

Thanks for asking!

We are actively developing a more straightforward scheduling language based on a new IR called TensorIR. The main features include:

Imperative scheduling. Using schedule primitives is as simple as applying independent compiler passes that transform an old TensorIR to a new one. Like PyTorch’s imperative execution, imperative scheduling lets you print and debug the scheduling process at any step, which provides a smoother debugging experience than schedule-tree-based TE scheduling.
Python-first syntax. At any step of scheduling, the TensorIR can be printed as Python syntax, which can in turn be parsed back into TensorIR/schedule state, i.e. it is a round-trippable DSL embedded in Python. The syntax is designed to be human-readable and easy to manipulate. For example, @spectrometerHBH and @vinx13 recently implemented block-sparse kernels in TensorIR within 20 lines of this Python DSL, and then applied auto-scheduling to them.
Competitive GEMM performance with auto tensorization. We have noticed growing demand for competitive GEMM performance, as you mentioned in the previous thread. TensorIR redesigns the tensorization mechanism, allowing direct embedding of tensor instructions (like Tensor Core) and microkernels. It also comes with an auto-scheduling framework that supports search with an XGB-based cost model. With these mechanisms in place, we expect a better chance of reaching competitive performance.
New schedule primitives made easy. The imperative scheduling style makes it much easier to introduce new schedule primitives, including loop partitioning, layout rewrite, etc. For your particular case, we have developed a primitive called reverse_compute_at, which computes the consumer under a specific loop of the producer. The shape of the computed region is handled automatically by the schedule, so you don’t have to do repetitive splitting, reordering, etc.
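To make the "imperative" and "printable as Python" points a bit more concrete, here is a toy sketch in plain Python. It is not TensorIR and does not use any TVM API: `LoopNest`, `split`, `reorder`, `to_python` and `run` are hypothetical names made up for this illustration. The only idea it borrows from the description above is that each primitive takes an old IR and returns a new one, and that every intermediate state can be printed as ordinary Python and inspected.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Loop:
    var: str
    extent: int


@dataclass(frozen=True)
class LoopNest:
    """Toy IR (hypothetical, not TensorIR): a perfect loop nest around one statement."""
    loops: Tuple[Loop, ...]
    body: str  # Python statement written in terms of the original loop variables
    bindings: Tuple[Tuple[str, str], ...] = ()  # (split variable, expression over new variables)


def split(nest, var, factor):
    """Return a NEW nest where loop `var` becomes `var_o` (outer) and `var_i` (inner)."""
    if var not in [loop.var for loop in nest.loops]:
        raise KeyError(var)
    new_loops = []
    for loop in nest.loops:
        if loop.var != var:
            new_loops.append(loop)
            continue
        if loop.extent % factor != 0:
            raise ValueError("this toy split only handles exact division")
        new_loops.append(Loop(f"{var}_o", loop.extent // factor))
        new_loops.append(Loop(f"{var}_i", factor))
    binding = (var, f"{var}_o * {factor} + {var}_i")
    return LoopNest(tuple(new_loops), nest.body, nest.bindings + (binding,))


def reorder(nest, order):
    """Return a NEW nest with the same loops in the given order."""
    by_name = {loop.var: loop for loop in nest.loops}
    if sorted(order) != sorted(by_name):
        raise ValueError("order must name every loop exactly once")
    return LoopNest(tuple(by_name[v] for v in order), nest.body, nest.bindings)


def to_python(nest, name="kernel"):
    """Print the current state as plain, executable Python source."""
    lines = [f"def {name}(A, B):"]
    indent = "    "
    for loop in nest.loops:
        lines.append(f"{indent}for {loop.var} in range({loop.extent}):")
        indent += "    "
    # Latest split first, so each variable is defined before it is used.
    for var, expr in reversed(nest.bindings):
        lines.append(f"{indent}{var} = {expr}")
    lines.append(f"{indent}{nest.body}")
    return "\n".join(lines)


def run(nest, A):
    """Execute the printed source on input A and return the output buffer."""
    B = [0] * len(A)
    scope = {}
    exec(to_python(nest), scope)
    scope["kernel"](A, B)
    return B


# Each step is an ordinary function call on the previous state.
s0 = LoopNest((Loop("i", 8), Loop("j", 4)), "B[i * 4 + j] = 2 * A[i * 4 + j]")
s1 = split(s0, "i", 4)
s2 = reorder(s1, ["i_o", "j", "i_i"])

for step in (s0, s1, s2):
    print(to_python(step))
    print()
```

Because every state is an immutable value, you can stop after any primitive, print it, run it, and compare it against the previous step. The real round trip described above (parsing the printed text back into IR) is not modeled here; the toy only goes as far as executing the printed source.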

RFC: [RFC] TensorIR: A schedulable IR for TVM. (Since that RFC, we have added some syntactic sugar to make it look simpler.)

We are preparing to upstream the codebase, and will keep the community closely updated on our latest status.