This Mini-(road)Map is a high-level design proposal that describes how µTVM M2 projects 1 (Library Generator), 6 (Project-level API), and 7 (tvmc integration) come together to form a new µTVM development workflow.
This doc isn’t a formal RFC for any one of these projects; before implementing each, a separate RFC should be submitted. However, small sketches of RFCs for each project are included at the end of the doc. Enough questions and interest in these topics have come up recently that I thought this was worth discussing in some detail, in case others want to start working on these projects before I have cycles.
Background
Right now, all µTVM workflows live entirely in Python scripts, and integrating with new RTOSes and microcontrollers is challenging. TVM drives the entire build process, so there is fairly tight coupling between TVM and the build system of any embedded RTOS in the picture. As µTVM matures and tvmc develops into a proper TVM command-line driver, changes are needed to make the µTVM workflow accessible to developers without intimate knowledge of TVM’s APIs.
µTVM Workflows
This section describes the µTVM workflow today and proposes a more developer-friendly workflow that M2 projects 1, 6, and 7 should work towards.
The µTVM Workflow Today
µTVM today supports essentially one workflow: compiling and running models on-device. This workflow is demonstrated in the micro_tflite tutorial, which breaks the process into 4 pieces (with the relevant TVM APIs and their return values listed):
- Model Library Generation — tvm.relay.build() -> CSourceModule
- Firmware compilation — tvm.micro.build_static_runtime() -> MicroBinary
- Device programming — tvm.micro.Compiler.flash() -> Transport
- Model execution — tvm.micro.Session() / tvm.runtime.GraphRuntime() -> NDArray
Though micro_tflite purports to be a user-facing inference tutorial, its workflow was actually designed around AutoTVM, in which hundreds to thousands of firmware images are compiled and tested on a fleet of devices. That workflow doesn’t necessarily map onto the semi-automated developer workflow expected from a command-line tool such as tvmc.
In particular, though there are ways to execute any one of these pieces standalone, there are significant limitations in the return values of each step that make it difficult to pause and resume the process. At each step of the way, there is additional state in micro_tflite.py that isn’t captured in these return values but is necessary for later steps. To name a few examples:
- Since PR 7002 and after PR 7398, CSourceModule is actually a collection of C files, and the de-facto way to save to disk produces a tar archive. However, this tar archive doesn’t include any downstream compiler configuration such as CFLAGS or libraries which may be needed by the operator implementation chosen by TVM, nor does it include the C runtime common libraries that those operators depend on.
- The MicroBinary artifact can be saved to and loaded from disk, but many RTOSes expect the build artifacts to be left in place on disk for further operations such as flashing and debugging.
- The Transport object returned from Device Programming is live and can’t be tied back to a particular development board. A MicroBinary includes no information that describes the targeted board.
The tvmc Workflow
I consider the pieces of the workflow described above to roughly correspond to steps in the average developer workflow (but I’m happy to debate that). In moving to tvmc, then, we mainly need to address these pause/resume challenges and design a tool that is usable by developers without requiring extensive knowledge of TVM internals. Two things are at the center of this challenge:
- The way we save state to disk between each step in the workflow
- The line between TVM and the firmware project that is compiled and flashed onto the device.
A hallmark of bare-metal programming is that extreme complexity can be hidden amongst several innocuous lines of code, and even the order in which those operations are performed or compiled can make the difference between functional and broken firmware. Therefore, this RFC proposes that tvmc should stay out of the firmware project, aside perhaps from an initial Project Generation step (described more below). Any other automation on a project should be performed by that project’s build tool, and TVM should not expect to move the firmware project or any build artifacts once they are created on disk.
A revised workflow based on this concept is below:
1. Model Library or Project Generation. TVM translates a Relay model into an artifact suitable to be included in a firmware project. See Model Library Format below for a strawman proposal of this format. The user then has two choices, and their choice defines the output of this step:
   - Manually integrate these pieces into a downstream firmware project. The output is described in Model Library Format below.
   - Run a script to generate a demo project. Internally, a Model Library Format .tar or dirtree is generated, and the ultimate output is the generated project’s directory tree.
2. Firmware compilation. The project’s build tool handles this. TVM can invoke the build tool in automated scenarios such as AutoTVM (see below).
3. Device programming. The project’s build tool handles this. TVM can invoke the build tool in automated scenarios such as AutoTVM (see below).
4. Model execution (assumes host-driven). The device is reset and TVM connects to the RPC server over a specified or autodetected UART (or Ethernet or USB peripheral, etc.). The Graph Runtime is instantiated according to the runtime configuration stored in the generated project or library .tar, and Simplified Parameters are loaded from the generated project or library .tar. A sketch of how this step could read configuration back out of the library .tar follows this list.
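To make step 4 a bit more concrete, here is a minimal sketch of how runtime configuration and Simplified Parameters could be pulled back out of a Model Library Format .tar. The member paths (runtime-config/graph.json, parameters.json) follow the strawman layout described under Model Library Format below and are assumptions, not a settled format.

```python
# Hypothetical sketch only: member names track the strawman Model Library Format
# described below and may change once the format is ratified in its own RFC.
import json
import tarfile

def load_runtime_config(library_tar_path):
    """Extract the graph JSON and Simplified Parameters from a Model Library Format .tar."""
    with tarfile.open(library_tar_path) as tf:
        graph_json = tf.extractfile("runtime-config/graph.json").read().decode("utf-8")
        params = json.loads(tf.extractfile("parameters.json").read().decode("utf-8"))
    return graph_json, params

# graph_json would then be handed to the Graph Runtime over the RPC session, and
# params uploaded to the device (unless -link-params baked them into the library).
```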
Sketch of Project RFCs
Here I briefly sketch some of the important parts of the M2 projects that enable this workflow change. These are not RFCs in their own right, but they highlight important points that each project should contribute towards this workflow vision.
Model Library Format
Whatever type of artifact we are ultimately generating in tvmc workflow step 1 (Model Library or Project), a necessary step is storing the tvm.relay.build() output on disk in a standardized format. The point here is to provide familiarity to users above the Python API level and to make it possible to automate project generation.
This format should at least keep the same on-disk organization across all configuration options (e.g. c vs llvm backend, -link-params, use of BYOC, graph vs AOT runtime, memory planner, etc). The produced output could be a .tar or a directory tree.
One possible organization is shown below:
- metadata.json - Overall metadata describing this build:
  - TVM target
  - Description/hash of original model
  - Original parameters?
  - Other state needed from model compilation later in the pipeline
- crt/ - The content of standalone_crt from TVM build/
- lib/ - Stores generated binary libraries
- parameters.json - JSON-serialized Simplified Parameters (or this could be a binary format)
- README.md - Perhaps a short standardized README for new users
- runtime-config/ - Stores runtime configuration
  - For GraphRuntime, graph.json should be created in this directory.
- src/ - Stores generated C source
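As a rough illustration of what the Model Library or Project Generation step could emit, the sketch below writes this layout to a temporary directory and bundles it into a .tar using only the Python standard library. The helper and its arguments are hypothetical glue, not an existing TVM API; in practice the inputs would come from tvm.relay.build(), and the exact contents of metadata.json would be pinned down in the Library Generator RFC.

```python
# Hypothetical packing helper for the strawman layout above; not an existing TVM API.
import json
import pathlib
import tarfile
import tempfile

def pack_model_library(metadata, graph_json, params_json, c_sources, out_tar):
    """Write the proposed on-disk layout and bundle it into out_tar."""
    with tempfile.TemporaryDirectory() as tmp:
        root = pathlib.Path(tmp)
        (root / "metadata.json").write_text(json.dumps(metadata, indent=2))
        (root / "parameters.json").write_text(params_json)
        (root / "runtime-config").mkdir()
        (root / "runtime-config" / "graph.json").write_text(graph_json)
        src_dir = root / "src"
        src_dir.mkdir()
        for name, code in c_sources.items():  # e.g. {"lib0.c": "...", "lib1.c": "..."}
            (src_dir / name).write_text(code)
        (root / "README.md").write_text("Placeholder README for the generated model library.\n")
        (root / "crt").mkdir()  # standalone_crt from the TVM build/ tree would be copied here
        (root / "lib").mkdir()  # generated binary libraries, if any
        with tarfile.open(out_tar, "w") as tf:
            for path in sorted(root.rglob("*")):
                tf.add(path, arcname=str(path.relative_to(root)), recursive=False)
```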
The Project API
The project workflow above can be divided into TVM-standard pieces and project-specific pieces. At present, Zephyr-specific logic is checked in to the TVM codebase. However, TVM is a complex compiler with many targets, and the length of its CI can make it a daunting project to contribute to. To facilitate faster collaboration, this RFC proposes that the project-specific pieces be moved into a separate git repository and invoked through a Project-level API. Specifically, in the case of the Zephyr integration, this would be:
- python/tvm/micro/contrib/zephyr.py - Zephyr Compiler and Flasher implementations
- tests/micro/qemu/zephyr-runtime - Embryonic template project
- Additional logic to implement the Project API
The full details of this process are left to a future Project-level API RFC, but some are in this Embryonic RFC and a sketch is here:
- To start with, a user obtains these pieces:
  - Model and inputs to compile
  - TVM repo
  - A µTVM Platform Provider — a platform-specific git repo that contains the Project-level API implementation plus any templates needed to generate projects.
- A Python (or other language) script lives in the root of the µTVM Platform Provider as e.g. microtvm.py. TVM executes this script to begin interacting with the API.
- Commands are written to stdin as one-line JSON requests. The script is expected to parse them and write a reply as one-line JSON (see the sketch after this list).
- TVM can issue these API commands:
  - GenerateProject(path/to/library.tar, project_config, project_dir) - Generate a new project in project_dir using this particular RTOS and project config. The script should copy itself into project_dir so it can be re-invoked there to issue further commands.
- In the generated project_dir, TVM then re-invokes the script to do further operations:
  - Build() - build the firmware binary for this project
  - Flash(serial_number) - flash the built firmware binary for this project to a device
  - Transport() - open a transport channel that connects to the on-device RPC server
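Below is a minimal sketch of what the microtvm.py entry point in a Platform Provider repo could look like for the stdin/stdout exchange described above. The command names and the {"command": ..., "args": ...} / {"status": ...} message shapes are illustrative assumptions; the actual wire format and command set would be fixed by the Project-level API RFC.

```python
#!/usr/bin/env python3
# Hypothetical Project API server sketch for a µTVM Platform Provider repo.
# Message shapes and command names are assumptions, not a ratified protocol.
import json
import sys

def generate_project(library, project_config, project_dir):
    # Platform-specific: unpack the Model Library Format artifact, instantiate the
    # template project in project_dir, and copy this script into project_dir so it
    # can be re-invoked there for Build/Flash/Transport.
    ...

def build():
    # Platform-specific: invoke the RTOS build tool (e.g. `west build` for Zephyr).
    ...

HANDLERS = {
    "generate_project": generate_project,
    "build": build,
    # "flash", "transport", ... would be registered the same way.
}

def main():
    # One JSON request per line on stdin; one JSON reply per line on stdout.
    for line in sys.stdin:
        request = json.loads(line)
        try:
            result = HANDLERS[request["command"]](**request.get("args", {}))
            reply = {"status": "ok", "result": result}
        except Exception as err:  # report failures to TVM instead of crashing
            reply = {"status": "error", "message": str(err)}
        sys.stdout.write(json.dumps(reply) + "\n")
        sys.stdout.flush()

if __name__ == "__main__":
    main()
```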
tvmc Integration
This one is largely specified by the workflow given above. Each workflow step is expected to roughly correspond to one tvmc subcommand. I’ll note a couple of things here:
- In moving the RTOS-specific code out of the TVM repo, the Model Library or Project Generation step is expected to take an additional config option—the path to the µTVM Platform Provider repo.
- Each step will probably need platform-specific configuration, and it’s not clear whether there should be a Provider API that tvmc could use to interrogate additional config options from the Provider Repo, or whether some generic key-value scheme is sufficient.
- This proposal leaves a debug command out of TVM. We don’t have to do that, but the hope is that this workflow lets the user launch their own debug tools rather than relying on microtvm_debug_shell as we do today. It’s expected that whatever tvmc command implements workflow step 4 would be used to drive on-device execution for debug purposes.
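Under this division, the tvmc subcommands covering steps 2 and 3 would mostly be thin clients over the Project API. Here is a hedged sketch of that client half, mirroring the microtvm.py sketch above; the class name, spawn mechanism, and message shapes are the same illustrative assumptions, not a settled interface.

```python
# Hypothetical client half of the Project API exchange; mirrors the server sketch above.
import json
import subprocess

class ProjectAPIClient:
    def __init__(self, script_path):
        # Spawn the provider's microtvm.py and keep the pipes open for further commands.
        self._proc = subprocess.Popen(
            ["python3", script_path],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )

    def _call(self, command, **args):
        self._proc.stdin.write(json.dumps({"command": command, "args": args}) + "\n")
        self._proc.stdin.flush()
        reply = json.loads(self._proc.stdout.readline())
        if reply["status"] != "ok":
            raise RuntimeError(reply["message"])
        return reply.get("result")

    def build(self):
        return self._call("build")

    def flash(self, serial_number):
        return self._call("flash", serial_number=serial_number)

# Usage (hypothetical): a `tvmc micro build`-style subcommand might do something like
#   client = ProjectAPIClient("path/to/project_dir/microtvm.py")
#   client.build()
```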
Discussion Topics
Some starter topics:
T1. Does this workflow make sense to you? Are there additional use cases or alternate workflow proposals we should support?
T2. Are there concerns with moving platform-specific logic into other git repos?
T3. Does this seem like it will be easier to use or overly complex?
Remember, specifics such as “this seems like it should not be JSON” are more appropriate for each project’s RFC. If they seem particularly important, we could discuss those here.
Finally, I won’t have bandwidth to work on this for a month or so. Community contributions (please send an RFC/PoC if working on a project) would be super-welcome here.
@tqchen @thierry @mdw-octoml @ramana-arm @manupa-arm @leandron @tgall_foo @gromero @liangfu