** Introduction **
This RFC details the integration of microTVM into TVMC, and the tvmc commands and options available when targeting devices supported by microTVM.
Currently, the entire microTVM compile/flash/run/debug process is driven end-to-end by the user via the microTVM Python API modules, as in the ‘tutorials/micro/micro_tflite.py’ script.
TVMC is a command-line driver for TVM (tvmc) that gives users better control over the compilation, execution, and tuning of input models on various hardware targets. However, it currently doesn’t support microTVM targets, so it cannot drive a similar workflow (compile, run, tune) on the targets microTVM supports, such as a variety of microcontrollers.
Extending tvmc to support these workflow stages for microTVM targets lets developers focus on testing, fixing bugs, and improving each stage independently, without having to drive the whole process end-to-end. Moreover, the workflow proposed by this RFC roughly corresponds to the common steps in the average developer workflow: compile a model, flash it (new stage proposed), run it, and tune it.
** Propositions **
To integrate microTVM into TVMC (the tvmc tool), the following changes are proposed:
1- New command
A new command, ‘flash’, must be added to tvmc in order to control flashing of microTVM devices. This adds a new stage to the existing TVMC workflow (compile, run, tune).
2- “Glue” between stages
In order to coordinate the different stages in the workflow (compile, flash, run, tune), it’s necessary to have a persistent file with metadata describing several aspects of the build (like the SDK used to generate the binary image, the target MCU, and the board) and an archive (.tar) containing at least:
a) the generated binary image to be flashed (like zephyr.{elf,hex});
b) the graph (mod.json); and
c) model parameters (mod.params).
Artifacts a, b, and c are all generated in the compile stage. Artifact a is then used in the flash stage and, finally, artifacts b and c are used only in the run stage.
It’s also proposed to add information about the target and board used in the compile stage to the metadata file. This avoids the need to specify the ‘--device’ flag again in later stages, such as ‘run’, since the target and board were already specified and can’t change.
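The “glue” described above could look roughly like the following minimal sketch. It is not taken from the prototype: the function names, the JSON metadata format, and the metadata keys (`sdk`, `target`, `board`) are assumptions made for illustration; only the archive members (binary image, mod.json, mod.params) come from this RFC.

```python
import io
import json
import tarfile


def write_build(archive_path, metadata_path, image_name, image,
                graph_json, params, metadata):
    """Store the compile-stage artifacts in a .tar and the metadata beside it.

    Hypothetical helper: the RFC does not define file names or a metadata format.
    """
    members = {
        image_name: image,                       # a) binary image to be flashed
        "mod.json": graph_json.encode("utf-8"),  # b) graph
        "mod.params": params,                    # c) model parameters
    }
    with tarfile.open(archive_path, "w") as tar:
        for name, data in members.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    with open(metadata_path, "w", encoding="utf-8") as f:
        json.dump(metadata, f, indent=2, sort_keys=True)


def load_device(metadata_path):
    """Return the (target, board) recorded at compile time.

    Later stages can call this instead of asking the user for the device again.
    """
    with open(metadata_path, encoding="utf-8") as f:
        metadata = json.load(f)
    return metadata["target"], metadata["board"]


def read_artifact(archive_path, name):
    """Read one artifact back, e.g. the image for 'flash' or the graph for 'run'."""
    with tarfile.open(archive_path, "r") as tar:
        return tar.extractfile(name).read()
```

With this layout, the flash stage would only call `read_artifact` for the binary image, while the run stage would read mod.json and mod.params and take the device from `load_device`.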
Finally, a mechanism should exist to avoid running a model on a target that was not previously flashed with the correct binary image, i.e. an image supporting the operations the model needs. A signature is proposed for this. The signature would be generated at compile time and stored as a static member of the RPC server, and a simple RPC call would be available to request it. A copy of the signature would be kept in the metadata file, and the tvmc run command would request the target’s signature via RPC and compare it against the saved copy. If the signatures match, tvmc runs the model; otherwise it aborts the run and informs the user about the mismatch.
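The signature check can be sketched as follows. This is only an illustration under stated assumptions: the RFC does not say how the signature is derived, so hashing the binary image with SHA-256 is an assumption, and `request_signature` is a hypothetical callable standing in for the proposed RPC call, not an existing TVM API.

```python
import hashlib


class SignatureMismatch(RuntimeError):
    """Raised when the target was not flashed with the expected image."""


def make_signature(image: bytes) -> str:
    # Assumption: the signature is a SHA-256 digest of the binary image.
    return hashlib.sha256(image).hexdigest()


def check_target(metadata: dict, request_signature) -> None:
    """Compare the saved signature against the one reported by the target.

    `request_signature` is a placeholder for the RPC call proposed in the RFC.
    """
    expected = metadata["signature"]
    actual = request_signature()
    if actual != expected:
        raise SignatureMismatch(
            f"target reports signature {actual}, but the build expects {expected}"
        )
```

In this sketch the run stage would call `check_target` before loading the model and abort with the error message on a mismatch.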
3- Selecting microTVM targets
There are currently two propositions to enable microTVM targets:
- Use a --micro flag (see prototype below), which makes additional flags available to select the board (--board) and the SDK used to generate the image (--sdk). This option is quite easy to implement, but it results in a kind of “fork” inside tvmc that uses little of the common code currently used for TVM. Example:
$ python3 -m tvm.driver.tvmc compile --micro --target=stm32f746xx --board=stm32f746g_disco --sdk=zephyr sine_model.tflite
- Build upon PR #7304 [1] and expand --target. In theory, this would make better use of the existing code. Example:
$ python3 -m tvm.driver.tvmc compile --target="zephyr -target=stm32f746xx -board=stm32f746g_disco" sine_model.tflite
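To make the difference between the two propositions concrete, here is a rough sketch of how each command line might be parsed. It uses only Python’s standard argparse and shlex modules; the function names are hypothetical, and this is not how tvmc or PR #7304 actually parse targets.

```python
import argparse
import shlex


def parse_micro_flags(argv):
    """Proposition 1: a --micro flag that enables --board and --sdk."""
    parser = argparse.ArgumentParser(prog="tvmc compile")
    parser.add_argument("--micro", action="store_true")
    parser.add_argument("--target")
    parser.add_argument("--board")
    parser.add_argument("--sdk")
    parser.add_argument("model")
    args = parser.parse_args(argv)
    # The extra flags only make sense for microTVM targets.
    if not args.micro and (args.board or args.sdk):
        parser.error("--board and --sdk require --micro")
    return args


def parse_expanded_target(target):
    """Proposition 2: one --target string carrying the SDK and its options."""
    kind, *opts = shlex.split(target)
    options = {}
    for opt in opts:
        key, _, value = opt.lstrip("-").partition("=")
        options[key] = value
    return kind, options
```

The first function needs a microTVM-specific branch in the argument handling, which is the “fork” mentioned above, while the second keeps a single --target entry point and pushes the device details into the target string.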
** Prototype **
A prototype leveraging the microTVM API (the same API used by the “micro_tflite.py” script in the micro tutorial) is provided as a reference [0]. A typical workflow looks like [2].
The prototype splits the run process into two stages: flash (introducing a new tvmc command, ‘flash’, for that purpose) and run. Running a model therefore no longer requires flashing the binary image again, which avoids wear on the target’s flash memory and, in some cases, eases the debug flow. For example, on some ARM MCUs an ST-Link is used both for flashing the device and for providing a GDB server abstraction for debugging, so it’s not possible to attach to the device via GDB and flash it at the same time. This makes it tricky to debug microTVM using the micro_tflite.py script or a similar one, because running is tied to flashing end-to-end (tvm.micro.Session(binary=micro_binary, flasher=flasher)). The prototype eases debugging because it’s possible to run the model without automatically flashing a new image.
The prototype doesn’t address the Project API as described in [3]. It also doesn’t implement the auto tuning (‘tune’ command for microTVM targets).
[0] GitHub - gromero/tvm at tvmc
[2] https://people.linaro.org/~gustavo.romero/tvm/tmvc_microtvm_prototype.txt
[3] [µTVM] Mini-Map: the µTVM Developer Workflow
@tqchen @thierry @mdw-octoml @ramana-arm @manupa-arm @leandron @tgall_foo @liangfu @areusch