Evolving and Modernizing Tensor-level IR

I strongly agree with this proposal! The schedule-based paradigm has been extremely effective and remains valuable for many workloads, but it is becoming clear that it cannot serve as the only foundational abstraction for all optimization needs, especially on modern GPUs.

Framing this as a separation between s-tir and a schedule-independent core TIR resonates with my own observations. Schedules are increasingly asked to express concerns that are more naturally part of the program itself. Treating this as an evolution rather than a replacement preserves existing investments while letting the core IR focus on low-level program structure, synchronization, and memory behavior. It also clarifies the role of TIR as a Python-first, low-level kernel programming substrate, making it easier to support emerging hardware features and new optimization patterns without overloading the schedule abstraction. I support the community moving in this direction and would be happy to discuss more as this effort progresses!
