This seems to solve problems similar to the ones I ran into with compile_engine (I was exploring this last year: [RFC] Refactor the compile_engine to expose a Relay -> TE translator). This looks like it will be quite a disruptive change, so naturally I’m interested in how we can gracefully handle refactors here while preserving ongoing work.
I have a few specific questions:
- Is the expected output of this flow a ‘hybrid’ IRModule with a Relay main function and a number of TIR primfuncs? If so, do you think there may be a place for a ‘full TIR’ module instead so that the main function can accurately handle output buffers among other things?
- For BYOC, could we in principle provide an interface for external targets to be compiled down to TIR rather than directly to a runtime module? I am primarily thinking of external targets that may wish to benefit from static memory planning.
- If you wanted to customize the TE->TIR lowering to use a custom scheduling method (like this one: [RFC] 'Cascade' Scheduling), would this expose a component that just lowers the Relay to unscheduled TEs?
I need some more time to read through the WIP PR, so I may have some further questions after that.
Thanks