The TVM community is working on adding Relay dialects. One example is QNN, which aims to support pre-quantized framework models. Each dialect will tend to have its own transformation passes, so the question naturally arises of where those passes should be called.
First, I think we should stick with the same relay.build_module API to avoid any confusion. This API can take a graph that contains dialect operators. At the same time, we want to keep build_module.cc clean, without any dialect code creeping in.
To solve this issue, my proposal is to add a BuildConfig string option, dialect_name. Each dialect can provide a wrapper around the sequence of passes it wants to run. This wrapper can be registered under a string name (for example, QNN for the QNN dialect); the exact registration mechanism is still open. build_module will look up the registered function using the dialect_name string supplied in the Relay BuildConfig and call it before running any of the Relay passes. This way, the changes to build_module stay minimal: a single call to the registered function.