I am currently exploring a “runtime-less” or “minimal-dependency” deployment strategy using TVM.
While the standard BYOC (Bring Your Own Codegen) pipeline is powerful, it often requires linking against external libraries, which introduces build dependencies for edge devices.
I am trying to implement a pipeline that:
- Performs operation pattern matching on the Relax graph.
- Instead of emitting a call to an external library function, parses the library and extracts only the micro-kernel source code (C/C++ kernels) required for the matched pattern.
- Injects the extracted source code into the C source that TVM generates, to create a self-contained, dependency-free binary.
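To make the extraction step concrete, here is a rough, TVM-independent sketch of what I have in mind. Everything in it is a hypothetical stand-in: `LIBRARY_SRC` is a toy "external library", `PATTERN_TO_KERNEL` is an assumed mapping from a matched pattern name to a kernel symbol, and the regex plus brace counting is a placeholder for a real C parser (it ignores braces inside strings, comments, and macros). It pulls out one kernel together with the helper functions it calls and emits them, dependencies first, as a snippet that could be prepended to generated C source.

```python
import re

# Toy stand-in for an external kernel library's C source (hypothetical content).
LIBRARY_SRC = """
static inline float clampf(float x, float lo, float hi) {
    return x < lo ? lo : (x > hi ? hi : x);
}

void relu6_f32(const float* in, float* out, int n) {
    for (int i = 0; i < n; ++i) { out[i] = clampf(in[i], 0.0f, 6.0f); }
}

void add_f32(const float* a, const float* b, float* out, int n) {
    for (int i = 0; i < n; ++i) { out[i] = a[i] + b[i]; }
}
"""

# Hypothetical mapping: matched graph pattern name -> kernel symbol in the library.
PATTERN_TO_KERNEL = {"clip_0_6": "relu6_f32", "add": "add_f32"}

# Identifiers that look like calls but are never library functions.
C_KEYWORDS = {"if", "for", "while", "switch", "return", "sizeof"}


def find_function(src, name):
    """Return the full text of the C function `name`, or None if src does not define it."""
    head = re.search(
        r"^[^\n;{}()]*\b" + re.escape(name) + r"\s*\([^;{}]*\)\s*\{", src, re.M
    )
    if head is None:
        return None
    # Naive brace counting from the opening brace of the body to its match.
    depth, i = 1, head.end()
    while depth and i < len(src):
        depth += {"{": 1, "}": -1}.get(src[i], 0)
        i += 1
    return src[head.start():i]


def collect(src, name, out, visiting=None):
    """Add `name` and every library function it calls to `out`, callees first."""
    visiting = set() if visiting is None else visiting
    if name in out or name in visiting or name in C_KEYWORDS:
        return out
    text = find_function(src, name)
    if text is None:
        return out  # not defined in this library (e.g. a libc call)
    visiting.add(name)
    for callee in re.findall(r"\b([A-Za-z_]\w*)\s*\(", text):
        collect(src, callee, out, visiting)
    out[name] = text  # inserted after its callees, so definitions precede uses
    return out


def emit_standalone(matched_patterns):
    """Build a self-contained C snippet for the kernels the matched patterns need."""
    kernels = {}
    for pattern in matched_patterns:
        collect(LIBRARY_SRC, PATTERN_TO_KERNEL[pattern], kernels)
    return "\n\n".join(kernels.values()) + "\n"


standalone_src = emit_standalone(["clip_0_6"])
print(standalone_src)
```

With only the `clip_0_6` pattern matched, the emitted snippet contains `clampf` and `relu6_f32` but not `add_f32`. Mutually recursive helpers would additionally need forward declarations, and real libraries add macros, headers, and intrinsics on top of this, which is the part I am least sure how to handle.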
My questions:
- Has anyone attempted this “selective kernel extraction” approach within the TVM ecosystem?
- Are there any existing RFCs, research papers, or projects that focus on stripping down external kernel libraries into standalone snippets for TVM’s source-based codegen?
- Would this approach be better implemented as a custom Relax Pass for code injection, or should it be handled within the External Codegen phase?
I would appreciate any insights, references, or pointers to similar efforts!