[RFC] TensorIR: A schedulable IR for TVM

This looks confusing to me because there is no example showing how to schedule the three blocks.

I’m curious whether TIR’s Block construct is expressive enough to support holistic fusion as in Rammer and HFuse, where different Blocks have heterogeneous workloads but can still be fused into a single kernel. If so, we could make such fusion a primitive in the TIR schedule and enlarge the auto-scheduling search space.
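To make the question concrete, here is a rough sketch of what I mean by fusing heterogeneous workloads into one kernel. It is plain Python, not TIR or any TVM API, and it is not the actual algorithm of Rammer or HFuse. All names (`kernel_a`, `kernel_b`, `fused_kernel`, `launch_fused`) are hypothetical, and the loop over `tid` is only a sequential stand-in for a GPU launch. The idea is that the thread index space of a single launch is partitioned, so one range of threads runs workload A and the rest run workload B.

```python
# Hypothetical illustration only: plain Python, not the TIR or TVM API.

def kernel_a(tid, x, out):
    # Workload A: elementwise scale, one element per thread.
    out[tid] = 2.0 * x[tid]


def kernel_b(tid, y, out, chunk):
    # Workload B: each thread reduces one contiguous chunk of y.
    out[tid] = sum(y[tid * chunk:(tid + 1) * chunk])


def fused_kernel(tid, threads_a, x, out_a, y, out_b, chunk):
    # Horizontal fusion: threads [0, threads_a) execute workload A,
    # the remaining threads execute workload B with a rebased index.
    if tid < threads_a:
        kernel_a(tid, x, out_a)
    else:
        kernel_b(tid - threads_a, y, out_b, chunk)


def launch_fused(x, y, chunk):
    # Assumes len(y) is a multiple of chunk.
    threads_a = len(x)
    threads_b = len(y) // chunk
    out_a = [0.0] * threads_a
    out_b = [0.0] * threads_b
    # Sequential stand-in for one launch of threads_a + threads_b threads.
    for tid in range(threads_a + threads_b):
        fused_kernel(tid, threads_a, x, out_a, y, out_b, chunk)
    return out_a, out_b
```

In TIR terms, the question is whether two Blocks with different iteration domains and bodies, like A and B here, could be bound to disjoint ranges of the same thread axis by a schedule primitive.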