Thanks, @tqchen, for bringing this up. TVMUnity provides more flexibility for performance optimization and new hardware integration. Here I’d like to share some of my experience from using TVMUnity:
Cross-layer Optimization
Layout is a cross-layer concern for end-to-end models: it influences both the graph-level representation and low-level TensorIR optimization. Most existing systems (e.g., TensorRT) handle layout by having human experts specify an “optimal” one. With the TVMUnity infra, however, a fully automated solution for layout rewriting (both weight layout and data layout) becomes possible.
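To make the layout point a bit more concrete, here is a minimal sketch in plain Python (deliberately not TVM API) of what a data-layout rewrite amounts to: an index map from one layout to another, plus a repacking of the buffer. The function names and the blocked `NCHW{block}c` target layout are my own illustrative assumptions, not something taken from the pre-RFC.

```python
from typing import List, Sequence, Tuple


def nchw_to_nchwc_index(n: int, c: int, h: int, w: int, block: int) -> Tuple[int, ...]:
    """Map an NCHW coordinate to its coordinate in a blocked NCHW{block}c layout.

    Hypothetical helper for illustration: the channel axis is split into
    (c // block, c % block) and the inner part moves to the last position.
    """
    return (n, c // block, h, w, c % block)


def flat_offset(index: Sequence[int], shape: Sequence[int]) -> int:
    """Row-major offset of a multi-dimensional index within the given shape."""
    offset = 0
    for i, extent in zip(index, shape):
        offset = offset * extent + i
    return offset


def repack_nchw_to_nchwc(data: List[float], shape: Sequence[int], block: int) -> List[float]:
    """Rewrite a flat NCHW buffer into the blocked layout using the index map."""
    N, C, H, W = shape
    if C % block != 0:
        raise ValueError("channel count must be divisible by the block size")
    new_shape = (N, C // block, H, W, block)
    out = list(data)  # same number of elements; every slot is overwritten below
    for n in range(N):
        for c in range(C):
            for h in range(H):
                for w in range(W):
                    src = flat_offset((n, c, h, w), shape)
                    dst = flat_offset(nchw_to_nchwc_index(n, c, h, w, block), new_shape)
                    out[dst] = data[src]
    return out
```

The graph level has to agree on the new shape `(N, C // block, H, W, block)`, while the low-level loops have to agree on the index map. That shared dependency is why I call layout a cross-layer concern, and an automated search would choose `block` (or a different map altogether) instead of an expert fixing it by hand.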
Interactive Transformations
Interactive transformations were first introduced by TensorIR and have received a lot of positive feedback. It would be an exciting milestone if we could use one programming language (i.e., TVMScript) for end-to-end models, with every transformation working around one central concept: IRModule.
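As a rough illustration of the "one central concept" idea, here is a toy sketch in plain Python. `ToyModule`, `Loop`, and the two functions below are hypothetical stand-ins I made up for this post, not TVM classes. The point is only the workflow: every step takes a module and returns a new one that can be printed and inspected before the next step is applied. In TVM itself, the analogous steps would be `tvm.tir.Schedule` primitives such as `split` and `reorder`.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Loop:
    var: str
    extent: int


@dataclass(frozen=True)
class ToyModule:
    """Hypothetical stand-in for an IRModule: a single perfect loop nest."""
    loops: Tuple[Loop, ...]

    def script(self) -> str:
        """Render the loop nest as readable pseudo-script for inspection."""
        lines = []
        for depth, loop in enumerate(self.loops):
            lines.append("    " * depth + f"for {loop.var} in range({loop.extent}):")
        lines.append("    " * len(self.loops) + "body(...)")
        return "\n".join(lines)


def split(mod: ToyModule, var: str, factor: int) -> ToyModule:
    """Return a new module with loop `var` split into outer and inner loops."""
    new_loops = []
    for loop in mod.loops:
        if loop.var == var:
            if loop.extent % factor != 0:
                raise ValueError("factor must divide the loop extent in this toy")
            new_loops.append(Loop(f"{var}_outer", loop.extent // factor))
            new_loops.append(Loop(f"{var}_inner", factor))
        else:
            new_loops.append(loop)
    return ToyModule(tuple(new_loops))


def reorder(mod: ToyModule, order: Sequence[str]) -> ToyModule:
    """Return a new module whose loops follow the given order of variables."""
    by_name = {loop.var: loop for loop in mod.loops}
    if sorted(order) != sorted(by_name):
        raise ValueError("order must name every loop exactly once")
    return ToyModule(tuple(by_name[v] for v in order))
```

A session would then look like `mod1 = split(mod0, "i", 32)`, then `print(mod1.script())`, then `mod2 = reorder(mod1, ["i_outer", "j", "i_inner"])`, with `mod0` left untouched so any step can be revisited.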
Also, Relax will address some of Relay's limitations in the future, for example dynamic-shape support and training infra. Happy to see this pre-RFC, and looking forward to the upstream commits that follow.