We have built the infra for Bring-Your-Own-Codegen (BYOC). For demonstration purposes, a simple CSourceModule-style codegen and runtime is used for ccompiler and dnnl (now called oneDNN). The CSourceModule runtime works reasonably well on small examples and is easy to understand. However, it also poses quite a few challenges for the development and deployment of relatively large models or models with relatively large inputs:
- The serialization is quite cumbersome, as it normally works on a per-operator basis and emits a wrapper to invoke the library.
- Handling large constants is difficult. We currently either have to introduce countless assignments or allocate a large chunk of memory on the static segment. Both approaches may significantly increase the compilation time.
- For certain backends, like TRT and dnnl, CSourceModule complicates the use of their execution engines, or even makes it impossible to use them.
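To make the constant-handling problem concrete, here is a minimal sketch of what writing a constant into generated C source could look like. `EmitStaticConst` is a hypothetical helper made up for illustration, not the actual CSourceModule codegen; it only shows that the emitted source grows linearly with the number of elements, which is what hurts compilation time for large weights.

```cpp
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical emitter (not part of TVM): writes a constant tensor as a
// C static array initializer, one literal per element.
std::string EmitStaticConst(const std::string& name, const std::vector<float>& data) {
  std::ostringstream os;
  // Scientific notation keeps every literal a valid C float literal.
  os << std::scientific << std::setprecision(9);
  os << "static const float " << name << "[" << data.size() << "] = {";
  for (std::size_t i = 0; i < data.size(); ++i) {
    if (i != 0) os << ", ";
    os << data[i] << "f";
  }
  os << "};";
  return os.str();
}
```

Calling this on a tensor with millions of elements produces a source string of tens of megabytes that the C compiler then has to parse, whereas a runtime that loads params from a binary artifact avoids that step.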
This RFC proposes a JSON runtime, paired with a JSON serializer, for BYOC, which effectively solves the above problems. In addition, this type of runtime is more familiar to the community, as the graph runtime is more or less in this style and we have already implemented a minimal example runtime. This RFC extends the minimal example and makes it general enough for all backends with an execution engine.
- JSON nodes and code generator/serializer
  - Data structures to represent the nodes and entries in a JSON runtime. The serializer converts a Relay program into JSON format.

        class JSONGraphNodeEntry {};
        class JSONGraphNode {};

        // Serialize a Relay program into JSON format; graph and params
        // should be saved in the same artifact
        class JSONSerializer : public ExprVisitor {};
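As a rough illustration of the node and entry data structures, the sketch below is standalone C++ with no TVM dependency. The field names (`node_id`, `index`, `op_type`, `name`, `inputs`), the free function `SerializeGraph`, and the JSON layout are assumptions for illustration only; the RFC itself only names the classes.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Assumed layout: an entry points at one output of a producing node.
struct JSONGraphNodeEntry {
  std::size_t node_id;  // index of the producing node in the node list
  std::size_t index;    // which output of that node is consumed
};

// Assumed layout: a node is an input, a constant, or a kernel call.
struct JSONGraphNode {
  std::string op_type;  // e.g. "input", "const", "kernel"
  std::string name;     // e.g. a variable name or a library op name
  std::vector<JSONGraphNodeEntry> inputs;
};

// Emit the node list as a JSON array.
// Simplification: names are assumed to need no JSON escaping.
std::string SerializeGraph(const std::vector<JSONGraphNode>& nodes) {
  std::ostringstream os;
  os << "[";
  for (std::size_t i = 0; i < nodes.size(); ++i) {
    const JSONGraphNode& n = nodes[i];
    if (i != 0) os << ",";
    os << "{\"op\":\"" << n.op_type << "\",\"name\":\"" << n.name << "\",\"inputs\":[";
    for (std::size_t j = 0; j < n.inputs.size(); ++j) {
      if (j != 0) os << ",";
      os << "[" << n.inputs[j].node_id << "," << n.inputs[j].index << "]";
    }
    os << "]}";
  }
  os << "]";
  return os.str();
}
```

In the real serializer the node list would be filled in while visiting the Relay expression, and the params would be stored next to the JSON string in the same artifact.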
- JSONRuntimeDriver
  - Deserialize the artifact and manage the initialization and invocation of the runtime.
  - Cache the engine when loading the library.

        class JSONRuntimeDriver : public ModuleNode {
          void Deserialize();              // Deserialize the artifact and engines
          PackedFunc GetFunction();        // Invoke a subgraph using symbol
          static Module LoadFromBinary();  // Load the JSON binary
          void SaveToBinary();             // Save the module
        };
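The engine caching could look roughly like the standalone sketch below. `Engine`, `JSONRuntimeDriverSketch`, and `RunCount` are hypothetical names, and `std::function` stands in for `PackedFunc`; the real `ModuleNode` interface has different signatures, so treat this only as an illustration of the "one cached engine per subgraph symbol" idea.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>

// Stand-in for a backend engine; a real one would wrap TRT, dnnl, etc.
struct Engine {
  std::string graph_json;
  int run_count = 0;
  void Run() { ++run_count; }
};

class JSONRuntimeDriverSketch {
 public:
  // Build one engine per subgraph symbol at load time and cache it.
  void Deserialize(const std::map<std::string, std::string>& symbol_to_json) {
    for (const auto& kv : symbol_to_json) {
      auto engine = std::make_shared<Engine>();
      engine->graph_json = kv.second;
      engines_[kv.first] = engine;
    }
  }

  // Return a callable that runs the cached engine for `symbol`.
  std::function<void()> GetFunction(const std::string& symbol) const {
    auto it = engines_.find(symbol);
    if (it == engines_.end()) {
      throw std::out_of_range("unknown symbol: " + symbol);
    }
    std::shared_ptr<Engine> engine = it->second;
    return [engine]() { engine->Run(); };
  }

  // Exposed only so the example can be inspected.
  int RunCount(const std::string& symbol) const {
    return engines_.at(symbol)->run_count;
  }

 private:
  std::map<std::string, std::shared_ptr<Engine>> engines_;
};
```

Because the returned callable captures the cached engine, repeated invocations of the same subgraph reuse the engine instead of rebuilding it.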
- JSONRuntimeBase
  - The base for handling a graph. It will be extended by the concrete backends, like TRT, dnnl, and other accelerators.

        class JSONRuntimeBase : public ModuleNode {
          virtual void Run() = 0;   // Invoke an engine
          virtual void Init() = 0;  // Build an engine
          // Utilities to save and load a JSON graph.
        };
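A concrete backend would then only need to fill in `Init` and `Run`. The sketch below is standalone and does not derive from the real `ModuleNode`; `JSONRuntimeBaseSketch` and `ToyRuntime` are hypothetical, and the lazy call to `Init` on the first `Run` is just one possible design, not something the RFC prescribes.

```cpp
#include <string>
#include <utility>
#include <vector>

class JSONRuntimeBaseSketch {
 public:
  explicit JSONRuntimeBaseSketch(std::string graph_json)
      : graph_json_(std::move(graph_json)) {}
  virtual ~JSONRuntimeBaseSketch() = default;

  virtual void Init() = 0;  // Build the backend engine from the graph
  virtual void Run() = 0;   // Invoke the engine

  const std::string& graph_json() const { return graph_json_; }

 private:
  std::string graph_json_;
};

// Toy backend: records lifecycle events instead of calling a real library.
class ToyRuntime : public JSONRuntimeBaseSketch {
 public:
  using JSONRuntimeBaseSketch::JSONRuntimeBaseSketch;

  void Init() override {
    log.push_back("init");
    initialized_ = true;
  }

  void Run() override {
    if (!initialized_) Init();  // Build the engine once, on first use
    log.push_back("run");
  }

  std::vector<std::string> log;

 private:
  bool initialized_ = false;
};
```

Running the toy backend twice records `init`, `run`, `run`, which mirrors the intent that the engine is built once and then reused across invocations.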
- Open questions
  - Symbolic representation of op attributes, e.g. `Expr start` and `Expr end` in the `arange` op. Normally we should not offload this type of node to accelerators, but how can we serialize them if we do want to support them, given that some of them may not be data-dependent?
  - It's intuitive for BYOC to be used along with uTVM. How will this JSON runtime be connected with other runtimes like uTVM?
@tqchen @thierry @mbaret @masahi @comaniac @manupa-arm @jonso @ramana-arm