As we discussed in the Apr. 11, 2023, TVM Unity Open Development Meeting, one of the issues in the current Relax implementation is that the compiler at various points relies on a specific ordering of passes (phase ordering) without clearly advertising this fact. Some of this phase ordering exists for good engineering reasons, so the existence of these dependencies among passes is not itself an issue; rather, what we should address is the fact that many of these dependencies are subtle and are not documented.
Some examples of phase ordering:
- `build()` in `vm_build.py` uses the `ToNonDataflow` pass to eliminate dataflow blocks, meaning that the low-level code generation passes do not have to deal with dataflow blocks.
- `VMShapeLower` has a comment indicating that it does not deal with nested functions and expects `LambdaLifting` to be called first, though it is not included in the default `build()`.
- The default VM code generator expects operators to be legalized, even though this is not included in the default `build()`.
- The `LegalizeOps` pass cannot handle certain operators. For example, `invoke_closure`, which may be introduced by `LambdaLifting`, does not have a legalization, meaning that `LegalizeOps` should be called before `LambdaLifting` (this is not documented anywhere).
- Somewhat related: The `tensor_to_shape` operator is lowered into a builtin by `DecomposeOpsForInference` and not by `VMLowerBuiltin`. This operator thus creates a hidden dependency on `DecomposeOpsForInference`, which is not documented anywhere.
- Normally, we expect TIR functions to be called only via `call_tir` (though this is not enforced in the compiler); however, `CallTIRRewrite` lowers `call_tir` operator calls into explicit tensor allocations and direct calls to the `PrimFunc`s (it would be good to note which passes should expect to deal with such calls and which should not).
- Additionally, phase ordering has resulted in headaches in my purity tracking PR, as having to reason about purity makes it very difficult to deal with lower-level code generation (e.g., lowering operators to builtins). This problem was solved by stripping away the purity checks during the default `build()`, but even that creates some issues in cases like using a BYOC custom code generator (see the `RunCodegen` pass in that PR).
In the community meeting, we proposed certain measures that we can take to deal with this complexity:
- We should certainly document our expectations as far as pass ordering goes. We could write it out in some file, e.g., have a `src/relax/transform/README.md` to explain this.
- For a technical measure, @tqchen proposed using module attributes to indicate what “phase” of compilation the module is in and checking that any passes invoked correspond to the correct phase. We could use the well-formedness checker to enforce invariants pertaining to the different phases.
- Another technical measure might be to use the `required_passes` field in the current Relax pass infrastructure (presently unused), though the attendees of the discussion were not in favor of having passes be run automatically (and thus “silently”), which may surprise users. If we use this field to indicate dependencies, it should ideally be used to give warnings rather than run passes automatically.
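To make the "warn, don't auto-run" idea concrete, here is a minimal sketch in plain Python. It does not use the TVM pass infrastructure: `EXPECTED_BEFORE` and `check_pass_order` are hypothetical names invented for illustration, and the table only encodes two of the orderings mentioned in the examples above.

```python
import warnings

# Hypothetical table (not part of TVM): pass name -> passes expected to
# have run earlier. The entries mirror two of the examples listed above.
EXPECTED_BEFORE = {
    "LambdaLifting": ["LegalizeOps"],
    "VMShapeLower": ["LambdaLifting"],
}


def check_pass_order(pipeline):
    """Warn about each pass whose expected predecessors have not run yet.

    Nothing is run or reordered automatically; the function only reports.
    Returns the list of warning messages so callers can inspect them.
    """
    seen = set()
    messages = []
    for name in pipeline:
        for dep in EXPECTED_BEFORE.get(name, []):
            if dep not in seen:
                msg = f"{name} expects {dep} to have run first"
                warnings.warn(msg)
                messages.append(msg)
        seen.add(name)
    return messages


if __name__ == "__main__":
    # LambdaLifting appears before LegalizeOps, so one warning is reported.
    for message in check_pass_order(["LambdaLifting", "LegalizeOps", "VMShapeLower"]):
        print(message)
```

If something like `required_passes` were used this way, the table would presumably live on the passes themselves rather than in a separate dictionary; the dictionary is only a stand-in here.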
To pursue any of these measures, however, we would have to agree on what phases we should have in compilation and which passes should act as transitions between these phases. At the meeting, we noted that at present there are two de facto phases: high-level transformations on a model and then low-level code generation (in `build()`). However, we may want to have a finer division of stages: an example was brought up involving GPU code generation, where some options might apply, and BYOC might also factor into this discussion, per the example that came up in the purity tracking PR. Additionally, we should decide on what dependencies are acceptable within a phase. Should there be an enforced ordering? It might be reasonable to rely on it in `build()`.
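As a rough illustration of the phase-attribute proposal, the sketch below models a module that carries a `"phase"` attribute and passes that declare which phase they accept and which phase they leave the module in. Everything here is a toy: `ToyModule`, `ToyPass`, the attribute key, and the phase names `"high_level"` and `"low_level"` are assumptions, and treating `ToNonDataflow` as the transition between the two de facto phases is only one possible choice, not a settled design.

```python
from dataclasses import dataclass, field


@dataclass
class ToyModule:
    """Stand-in for an IRModule; it only carries attributes."""
    attrs: dict = field(default_factory=lambda: {"phase": "high_level"})


@dataclass
class ToyPass:
    """A pass that declares the phase it accepts and the phase it produces.

    A transition pass is one whose produces_phase differs from requires_phase.
    """
    name: str
    requires_phase: str
    produces_phase: str

    def __call__(self, mod):
        current = mod.attrs["phase"]
        if current != self.requires_phase:
            raise ValueError(
                f"{self.name} expects phase {self.requires_phase!r}, "
                f"but the module is in phase {current!r}"
            )
        # A real pass would transform the module here; the toy version
        # only records the resulting phase on a new module.
        return ToyModule(attrs={**mod.attrs, "phase": self.produces_phase})


# Assumed phase assignments, for illustration only.
legalize = ToyPass("LegalizeOps", "high_level", "high_level")
to_non_dataflow = ToyPass("ToNonDataflow", "high_level", "low_level")
shape_lower = ToyPass("VMShapeLower", "low_level", "low_level")

if __name__ == "__main__":
    mod = ToyModule()
    for p in (legalize, to_non_dataflow, shape_lower):
        mod = p(mod)
    print(mod.attrs["phase"])
    try:
        legalize(mod)  # a high-level pass applied after lowering
    except ValueError as err:
        print(err)
```

A check like this only enforces ordering between phases; whether ordering within a phase should also be enforced is the open question raised above.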