The current design of the compile_engine utilises ScheduleGetter to translate a primitive function into a scheduled tensor expression. However, because it is an all-in-one pass, the translation is directly coupled to the schedules defined in TOPI. It would be more useful to break this into two stages: one which converts the Relay function into an unscheduled TE graph, and another which applies the TOPI-derived scheduling. We can then expose the Relay → TE translation step so that it can be reused by alternative scheduling approaches, for instance the cascading scheduling I outlined here.
In particular, I propose creating a TETranslator pass (deriving from MemoizedExprTranslator) and reducing the scope of ScheduleGetter so that it is just an ExprVisitor which picks out the anchor implementation and function name. The TETranslator would then be exposed as an API which other components could reuse.
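To make the proposed split concrete, here is a toy Python sketch of the two stages. It is not TVM code: Var, Call, Stage, ToyTranslator and both schedule functions are hypothetical stand-ins invented for illustration, and the memo table only mimics the idea of a memoized translator. The point it tries to show is that stage 1 produces an unscheduled description once, and different schedulers can then consume it independently.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


# Hypothetical stand-ins for Relay expressions (not TVM classes).
@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Call:
    op: str
    args: Tuple["Expr", ...]


Expr = Union[Var, Call]


# Hypothetical stand-in for one unscheduled compute stage.
@dataclass(frozen=True)
class Stage:
    name: str
    op: str
    inputs: Tuple[str, ...]


class ToyTranslator:
    """Stage 1: lower an expression into unscheduled stages, memoizing shared subexpressions."""

    def __init__(self) -> None:
        self._memo: Dict[Expr, str] = {}
        self.stages: List[Stage] = []

    def translate(self, expr: Expr) -> str:
        # Return the cached output name if this subexpression was already lowered.
        if expr in self._memo:
            return self._memo[expr]
        if isinstance(expr, Var):
            out = expr.name
        else:
            inputs = tuple(self.translate(arg) for arg in expr.args)
            out = f"{expr.op}_{len(self.stages)}"
            self.stages.append(Stage(out, expr.op, inputs))
        self._memo[expr] = out
        return out


def translate_to_stages(expr: Expr) -> List[Stage]:
    """Reusable entry point: returns the unscheduled stages only."""
    translator = ToyTranslator()
    translator.translate(expr)
    return translator.stages


def schedule_default(stages: List[Stage]) -> List[str]:
    """Stage 2, variant A: compute every stage separately."""
    return [f"compute {s.name}" for s in stages]


def schedule_inline_all(stages: List[Stage]) -> List[str]:
    """Stage 2, variant B: inline everything except the final stage."""
    return [f"inline {s.name}" for s in stages[:-1]] + [f"compute {stages[-1].name}"]


# Usage: exp(x) is shared by both arguments of add, so it is lowered once.
x = Var("x")
shared = Call("exp", (x,))
body = Call("add", (shared, shared))
unscheduled = translate_to_stages(body)
print(schedule_default(unscheduled))
print(schedule_inline_all(unscheduled))
```

Both schedulers take the same translator output, which is the property the proposal is after: an alternative scheduling approach would only need the unscheduled result, not ScheduleGetter itself.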
If we agree that this change would be valuable, the open questions are what to name the Relay → TE translator component and where it should live. Here’s my current strawman:
- TETranslator as a new pass in backend/compile_engine.cc
- Expose the translator as a global with:
      TVM_REGISTER_GLOBAL("relay.backend._TranslateToTE")
          .set_body_typed([](Function prim_func, Target target) {
            auto translator = TETranslator(target);
            return translator.Translate(prim_func);
          });
- Create a Python API in compile_engine.py called 'translate_to_te'
I’ve pushed a WIP PR with this strawman which you can find here.
Thanks