Combining Separate tir.Block using compute_at()

Thanks @xiacijie for the reply!

  1. Yes, I get what you mean and definitely agree with the definition.
  2. No. We have a primitive called “blockize” that does roughly the opposite (not exactly, but it creates more blocks), and we have thought about adding a block-merging primitive. Development should be fairly simple (~200 lines in the core implementation), and we are more than happy to assist if you want to work on it :slight_smile:
  3. To be clear, merging blocks is a transformation that by itself doesn’t bring any performance gain: a Block in TensorIR is a construct that creates conceptual isolation but lowers to nothing, so whether or not blocks are merged, the generated code is the same.

The reason it has not been developed is that we haven’t yet found a real-world scenario where this primitive is useful, and I would really appreciate it if you could bring up an example :+1: