RFC/Discussion : uTVM & Zephyr Improvements

tgall_foo · December 14, 2020, 6:30pm

As I’ve been getting up to speed with the current code and getting it working with my STM32F746, I’ve been thinking we should perhaps consider a few things.

When it comes to Zephyr and/or mBed or other RTOSes I’d like to suggest reference projects would be substantially better if we pushed that into those respective communities.

In the case of Zephyr this might look like a sample app.
Positive: A Zephyr user essentially gets the µTVM app for free as part of their environment.
Positive: Those familiar with Zephyr but not with µTVM might be more inclined to use µTVM, presuming it would be easier to discover as part of the Zephyr project.
Positive: Zephyr has CMSIS integrated.
Negative: External modules would need to be kept in sync across multiple RTOS projects, though having the same project utilize a TVM environment variable might solve this issue.

There is precedent for this: TF Lite Micro, for instance, is integrated into Zephyr.

µTVM should resist as much as possible hard coding board information especially when making use of an RTOS.

consider : python/tvm/micro/contrib/zephyr.py

BOARD_USB_FIND_KW = { "nucleo_f746zg": {"idVendor": 0x0483, "idProduct": 0x374B},

I think we have to catch this kind of thing for every RTOS (mBed, Zephyr, etc.). I’d like to think that an object like ZephyrFlasher should be able to make use of the Zephyr tooling.
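As a rough sketch of what avoiding the hard-coded table could look like, the board USB identifiers could be loaded from data supplied by the RTOS-side project rather than living in TVM's source. Everything here is an assumption for illustration: the JSON layout and the helpers `load_board_usb_ids` and `find_usb_kwargs` are hypothetical and not part of TVM or Zephyr; only the nucleo_f746zg IDs come from the snippet above.

```python
import json


def load_board_usb_ids(json_text):
    """Parse a board -> USB-ID mapping from JSON text (hypothetical format)."""
    raw = json.loads(json_text)
    table = {}
    for board, ids in raw.items():
        # Accept "0x0483"-style strings so the data stays human-readable.
        table[board] = {
            "idVendor": int(ids["idVendor"], 16),
            "idProduct": int(ids["idProduct"], 16),
        }
    return table


def find_usb_kwargs(table, board):
    """Return USB lookup keywords for a board, or fail with a clear error."""
    try:
        return table[board]
    except KeyError:
        raise ValueError(f"no USB IDs known for board {board!r}") from None


# Example data; in practice this would ship with the RTOS-side project.
EXAMPLE_JSON = '{"nucleo_f746zg": {"idVendor": "0x0483", "idProduct": "0x374B"}}'
BOARD_TABLE = load_board_usb_ids(EXAMPLE_JSON)
```

With this shape, supporting a new board means adding an entry to the data file, not patching TVM.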

tvmc integration

I’d like to suggest that we should work to include µTVM support within tvmc. Within the microcontroller ecosystem, this would make things easier to use for a population of developers who are new to the world of machine learning compilers. @ramana-arm

Let’s start another thread specific for this topic and hammer out what an interface should look like.

areusch · December 15, 2020, 11:58pm

hi @tom-gall, thanks for posting this.

When it comes to Zephyr and/or mBed or other RTOSes I’d like to suggest reference projects would be substantially better if we pushed that into those respective communities.

I completely agree with this idea and it’s something I’d like us to look at in the near future. The current compilation toolchain has been written to work with AutoTVM, but as I’m finishing development on that, it seems like the design is fairly tightly integrated with TVM. I support the idea of a “project generator interface” inside TVM, whose implementation could live in a separate, project-specific repository. And I would agree with making AutoTVM use this interface as the primary way to compile operators.

µTVM should resist as much as possible hard coding board information especially when making use of an RTOS.

I also agree with this, but I’d note here that python/tvm/micro/contrib is intended to contain RTOS-specific implementations of the generic tvm.micro.Compiler interface. I’d propose that we narrow the tvm.micro.Compiler interface, and move the code in zephyr.py into a separate project akin to utvm-zephyr-runtime.

There is some complexity here that TVM needs to be made aware of and which TVM may need to communicate across the interface. Compilation for AutoTVM means that TVM needs an automated way to drive compilation and flashing across the set of eligible operator schedules:

TVM may choose an operator schedule that requires external libraries to compile. Right now we only have one such external library dependency, but as more are added, TVM may need a way to indicate which external libraries it needs. It’s possible we could add a target flag, but this may become cumbersome. TVM could also interrogate this from the project.
For RPC server use, TVM needs a way to determine which serial port or e.g. semihosting console to use for communication.
A user may want to use an existing firmware binary (i.e. skip the build, for debugging or reproducibility), but drive inference over the RPC server using TVM. This means TVM will need to tell the project’s Compiler (or derivative) implementation to flash an existing binary without building.
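To make the three requirements above concrete, here is a minimal sketch of what a narrowed TVM ↔ project interface might look like. This is not an existing TVM API: `ProjectInterface`, its method names, and `prepare_device` are all hypothetical names chosen for illustration.

```python
from abc import ABC, abstractmethod


class ProjectInterface(ABC):
    """Hypothetical contract between TVM and an RTOS-specific project."""

    @abstractmethod
    def supported_libraries(self):
        """Return the external library names the project can link."""

    @abstractmethod
    def build(self, required_libraries):
        """Build firmware and return the path to the resulting binary."""

    @abstractmethod
    def flash(self, binary_path):
        """Flash a binary; must also work for binaries built elsewhere."""

    @abstractmethod
    def transport_name(self):
        """Return the serial port or console used to reach the RPC server."""


def prepare_device(project, required_libraries, prebuilt_binary=None):
    """Build (unless a prebuilt binary is given), flash, return the transport."""
    if prebuilt_binary is None:
        # TVM interrogates the project instead of relying on a target flag.
        missing = set(required_libraries) - set(project.supported_libraries())
        if missing:
            raise RuntimeError(f"project cannot provide: {sorted(missing)}")
        binary = project.build(required_libraries)
    else:
        # Skip the build entirely, e.g. for debugging or reproducibility.
        binary = prebuilt_binary
    project.flash(binary)
    return project.transport_name()
```

Keeping `build` and `flash` as separate calls is what allows the flash-without-build case; a single combined "build and flash" entry point would rule it out.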

I’d like to suggest that we should work to include µTVM support within tvmc.

This sounds good to me. I might advocate for us to split out the Zephyr-specific stuff first, then propose a tvmc interface that makes sense after the TVM ↔ project interface is clear. I suspect that with tlcpack, users may install older versions of TVM accidentally, and I’d like to be judicious about which commands we include in the tvmc interface.

manupa-arm · December 16, 2020, 4:34pm

For the deployment flow, I think we could have a cleaner flow to produce C sources or object files (I prefer the latter, but C sources should also be fine in case users need manual customization post-TVM). The intent is that these artifacts could be obtained via tvmc compile, so that the integrator (who integrates the app and RTOS) could use them directly in the project.

However, as @areusch points out, the tuning process makes this complicated in the absence of dynamic linking on the device that runs the RPC server.

Lately, we’ve been wondering whether we could use specialized apps as RPC servers for each RTOS, which would either have an auxiliary interface to obtain linking information or run a lightweight linking process to map platform-specific functions in the server itself. The compiler would then not need to generate RTOS-specific projects and could rely on the server to provide the linking info or do the linking. @areusch side question: do we currently re-flash in the process of tuning, or do we just handle the operator code in heap/stack?

This could allow a cleaner compilation interface while supporting tuning.

thoughts : @mjs @Leo-arm @tqchen

areusch · December 16, 2020, 6:42pm

Lately, we’ve been wondering whether we could use specialized apps as RPC servers for each RTOS, which would either have an auxiliary interface to obtain linking information or run a lightweight linking process to map platform-specific functions in the server itself. The compiler would then not need to generate RTOS-specific projects and could rely on the server to provide the linking info or do the linking. @areusch side question: do we currently re-flash in the process of tuning, or do we just handle the operator code in heap/stack?

Currently we reflash during tuning, using the Flasher interface. At minimum, we should reset the board as much as possible between tuning runs. It might become necessary to power it off to reproduce performance in some cases.
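Roughly, the flash-then-reset pattern per tuning candidate can be sketched as below. This is only an illustration, not the actual AutoTVM code path: `tune` and its `flash`, `reset`, and `measure` callables are hypothetical stand-ins for a Flasher-like object and an RPC session.

```python
def tune(candidates, flash, reset, measure, resets_between_runs=1):
    """Measure each candidate on freshly flashed, freshly reset hardware.

    `candidates` maps a schedule name to its firmware binary. The three
    callables are supplied by the caller (hypothetical stand-ins).
    """
    results = {}
    for name, binary in candidates.items():
        flash(binary)
        for _ in range(resets_between_runs):
            reset()  # start each measurement from a known board state
        results[name] = measure(name)
    # Lower measured time is better.
    best = min(results, key=results.get)
    return best, results
```

If a plain reset turns out not to reproduce performance, the `reset` callable is the natural place to swap in a power cycle without touching the loop.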

I do want to have a discussion about simplifying the Compiler interface and supporting some concept of “loadable libraries,” but let’s move that to a separate thread to avoid hijacking this one. I created this thread to discuss that.