As we discussed in the Apr. 11, 2023, TVM Unity Open Development Meeting, one of the issues in the current Relax implementation is that the compiler at various points relies on a specific ordering of passes (phase ordering) without clearly advertising this fact. Some of this phase ordering exists for good engineering reasons, so the existence of these dependencies among passes is not itself an issue; rather, what we should address is the fact that many of these dependencies are subtle and are not documented.
Some examples of phase ordering:
- The `ToNonDataflow` pass eliminates dataflow blocks, meaning that the low-level code generation passes do not have to deal with dataflow blocks.
- `VMShapeLower` has a comment indicating that it does not deal with nested functions and expects `LambdaLifting` to be called first, though it is not included in the default
- The default VM code generator expects operators to be legalized, even though this is not included in the default
- The `LegalizeOps` pass cannot handle certain operators. For example, `invoke_closure`, which may be introduced by `LambdaLifting`, does not have a legalization, meaning that `LegalizeOps` should be called before `LambdaLifting` (this is not documented anywhere).
- Somewhat related: The `tensor_to_shape` operator is lowered into a builtin by `DecomposeOpsForInference` and not by `VMLowerBuiltin`. This operator thus creates a hidden dependency on `DecomposeOpsForInference`, which is not documented anywhere.
- Normally, we expect TIR functions to be called only via `call_tir` (though this is not enforced in the compiler); however, `call_tir` operator calls are lowered into explicit tensor allocations and direct calls to the `PrimFunc`s (it would be good to note which passes should expect to deal with such calls and which should not).
- Additionally, phase ordering has resulted in headaches in my purity tracking PR, as having to reason about purity makes it very difficult to deal with lower-level code generation (e.g., lowering operators to builtins). This problem was solved by stripping away the purity checks during the default `build()`, but even that creates some issues in cases like using a BYOC custom code generator (see the `RunCodegen` pass in that PR).
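To make the kind of dependency listed above concrete, here is a minimal sketch in plain Python (no TVM imports) that writes two of the orderings down as data and checks a proposed pass list against them. The table `MUST_PRECEDE` and the helper `ordering_violations` are hypothetical names made up for illustration; nothing like this exists in the compiler today, and the table only covers the two orderings stated explicitly in the list.

```python
# Hypothetical table of ordering constraints taken from the examples above.
# Each pair (first, second) means "first should run before second".
MUST_PRECEDE = [
    ("LegalizeOps", "LambdaLifting"),   # invoke_closure has no legalization
    ("LambdaLifting", "VMShapeLower"),  # VMShapeLower skips nested functions
]


def ordering_violations(pipeline):
    """Return the (first, second) constraints that `pipeline` violates.

    `pipeline` is a list of pass names in the order they would run.
    A constraint is only checked when both passes appear in the list;
    if a pass appears more than once, its last occurrence is used.
    """
    position = {name: index for index, name in enumerate(pipeline)}
    violations = []
    for first, second in MUST_PRECEDE:
        if first in position and second in position:
            if position[first] > position[second]:
                violations.append((first, second))
    return violations


if __name__ == "__main__":
    proposed = ["LambdaLifting", "LegalizeOps", "VMShapeLower"]
    for first, second in ordering_violations(proposed):
        print(f"{first} should run before {second}")
```

Even if no such check is ever added, a table of this shape is roughly what the documentation would need to spell out.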
In the community meeting, we proposed certain measures that we can take to deal with this complexity:
- We should certainly document our expectations as far as pass ordering goes. We could write it out in some file, e.g., have a `src/relax/transform/README.md` to explain this.
- For a technical measure, @tqchen proposed using module attributes to indicate what “phase” of compilation the module is on and check that any passes invoked correspond to the correct phase. We could use the well-formedness checker to enforce invariants pertaining to the different phases.
- Another technical measure might be to use the `required_passes` field in the current Relax pass infrastructure (presently unused), though the attendees at the discussion were not in favor of having passes be automatically (and thus “silently”) run, which may surprise users. If we use this field to indicate dependencies, it should ideally be used to give warnings rather than run passes automatically.
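The two technical measures could look roughly like the following plain-Python sketch. It does not use the TVM API: `Module`, `PassSpec`, and `run_pass` are hypothetical stand-ins, the phase names `"high_level"` and `"low_level"` are assumptions, and treating `ToNonDataflow` as the transition between phases is purely illustrative. A phase mismatch is an error, while a missing dependency only produces a warning and nothing is run automatically.

```python
import warnings
from dataclasses import dataclass, field
from typing import Optional, Tuple


@dataclass
class Module:
    """Stand-in for an IRModule; only the attributes matter here."""
    attrs: dict = field(default_factory=lambda: {"phase": "high_level"})
    applied: list = field(default_factory=list)  # names of passes run so far


@dataclass
class PassSpec:
    """Hypothetical pass description, not the real Relax pass info."""
    name: str
    phase: str                          # phase the module must be in
    next_phase: Optional[str] = None    # set if the pass moves to a new phase
    required_passes: Tuple[str, ...] = ()


def run_pass(spec, mod):
    """Check the phase, warn about missing dependencies, record the pass."""
    current = mod.attrs["phase"]
    if current != spec.phase:
        raise ValueError(
            f"{spec.name} expects phase {spec.phase!r}, module is in {current!r}"
        )
    for dep in spec.required_passes:
        if dep not in mod.applied:
            # Warn only; never run the dependency silently.
            warnings.warn(f"{spec.name} expects {dep} to have run first")
    mod.applied.append(spec.name)
    if spec.next_phase is not None:
        mod.attrs["phase"] = spec.next_phase
    return mod


# Illustrative phase assignments (assumptions, not documented behavior).
to_non_dataflow = PassSpec("ToNonDataflow", "high_level", next_phase="low_level")
vm_shape_lower = PassSpec(
    "VMShapeLower", "low_level", required_passes=("LambdaLifting",)
)
```

Calling `run_pass(vm_shape_lower, Module())` fails immediately because the module is still in the high-level phase. Running `to_non_dataflow` first lets it through, with a warning that `LambdaLifting` has not been applied.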
To pursue any of these measures, however, we would have to agree on what phases we should have in compilation and which passes should act as transitions between these phases. At the meeting, we noted that at present there are two de facto phases: high-level transformations on a model and then low-level code generation (in `build()`). However, we may want to have a finer division of stages (an example was brought up involving GPU code generation, where some options might apply. BYOC might also factor into this discussion, per the example that came up in the purity tracking PR). Additionally, we should decide on what dependencies are acceptable within a phase (should there be an enforced ordering? It might be reasonable to rely on it in