Hi @JosseVanDelm,
I agree there have been quite a few changes since the last blog post. We’ll give an updated overview at TVMconf in a couple of weeks’ time.
Do you need to run autotuning to start with, or just run inference? If the latter, you definitely don’t need to bother with an OS. I would just try to build with the `c` target and link the generated code and graph runtime into a binary for your platform. You could follow the build steps in `test_crt.py` and then export the generated code with `mod.export_library()` to produce a C file you can compile for your target.
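To make that concrete, a minimal sketch of the host side might look like the following. I'm assuming `mod` and `params` already exist from a frontend import, and the exact accessor names on the build result may differ a little between TVM versions:

```python
import tvm
from tvm import relay

# `mod` and `params` are the Relay module and weights from whichever
# frontend you import your model with (TFLite, ONNX, ...).
# The "c" target emits C source instead of host machine code;
# tir.disable_vectorize is needed because plain C has no vector types.
with tvm.transform.PassContext(opt_level=3, config={"tir.disable_vectorize": True}):
    factory = relay.build(mod, target="c", params=params)

# Exporting to a .tar keeps the generated C sources so you can feed them to
# your cross toolchain and link them against the CRT and graph runtime.
factory.get_lib().export_library("model.tar")
```

You’d also carry the graph JSON and the parameters from the build over to the device so the graph runtime can load them; `test_crt.py` shows one way to wire that up.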
From a time perspective, how practical is it to set up a UART or semihosting connection on your development board? The µTVM code is quite new right now, so while we don’t want efforts like this to take long, we don’t have the documentation for this sorted out just yet. Happy to answer questions if you want to pursue this path.
I’ve included some more detailed answers to your questions below.
Andrew
- Do I understand correctly that you are trying to replace the simple OpenOCD interface that needs the read/write/execute functionality with a C runtime and a minimal RPC server that connects through UART? Could you maybe elaborate on the changes there? Why is this necessary? I suppose to benefit even more from what is already realized in the rest of the TVM stack?
The main driver behind these changes is actually portability for autotuning. None of these changes affect the deployment requirements: µTVM does not assume the presence of an operating system, and the runtime it requires is more a set of support functions for things like memory allocation and error reporting (the `TVMPlatform` functions are the chip-dependent ones).
However, autotuning assumes that the target environment performs consistently between runs, and on a bare-metal platform, the only reasonable way to guarantee that is to fully control the set of instructions that execute between SoC reset and model execution. A major limitation of the previous approach was that you’d get different absolute timing numbers depending on which program was loaded in flash.
So to allow reproducible autotuning in a way that’s friendly to first-time users, we needed to choose a portable approach. This is why we’ve introduced the RPC server plus Zephyr support. It should be noted that we aren’t requiring you to use Zephyr: we want to make it easy to build the RPC server into whichever runtime environment you choose. In that case, you just need to provide implementations of the `Compiler` and `Flasher` classes.
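If it helps, the skeleton of those implementations would be roughly as below. I'm assuming the base classes are importable from `tvm.micro` as on current main, and the class names and method signatures here are approximate, so treat this as a sketch rather than a reference:

```python
import tvm.micro


class GAP8Compiler(tvm.micro.Compiler):
    """Drives your cross toolchain (class and method names here are illustrative)."""

    def library(self, output, sources, options=None):
        # Compile the generated C sources with your cross compiler and
        # collect the objects into a static library under `output`.
        ...

    def binary(self, output, objects, options=None, link_main=True, main_options=None):
        # Link the operator libraries, the CRT, and your main() into a
        # firmware image for the board.
        ...


class GAP8Flasher(tvm.micro.Flasher):
    """Programs the board and opens the transport the RPC client talks over."""

    def flash(self, micro_binary):
        # Flash the image with your vendor tooling, then return a transport
        # (e.g. wrapping your UART) that the host-side RPC client can drive.
        ...
```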
- Zephyr does not support GAP8. To be honest I’m not sure what such a low-level OS actually provides. Do I need an OS with the current changes? Would this facilitate deployment? I’ve seen MBED-os being mentioned on both uTVM and GAP8 sides. Could this be an interesting approach?
You don’t need an OS, strictly speaking; you just need a small `main()` that can configure the SoC and launch either the RPC server (for autotuning) or the graph runtime (for inference). You’ll link different µTVM libraries into each binary (i.e. you’ll also link the RPC server library when autotuning). I have an mBED implementation of `Compiler` and `Flasher` here that you could try, though it needs to be synced with main. This could be a good route for you if mBED is well supported on that board. Or, if it’s easier for you to write UART send/receive functions, you could just do without an OS.
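Whichever route you take, the host-side driver script ends up the same. Roughly, and with the caveat that these helper names (`Session`, `create_local_graph_runtime`, `get_system_lib`, the factory accessors) are from current main and may still shift:

```python
import tvm
import tvm.micro

# Assumptions: `micro_binary` is the firmware image produced through your
# Compiler, `flasher` is an instance of your Flasher, and `factory` is the
# relay.build(..., target="c") result from earlier.
with tvm.micro.Session(binary=micro_binary, flasher=flasher) as session:
    # The session flashes the board and talks to the on-device RPC server
    # over the transport your Flasher returned (e.g. UART).
    graph_mod = tvm.micro.create_local_graph_runtime(
        factory.get_json(), session.get_system_lib(), session.context
    )
    graph_mod.set_input(**factory.get_params())
    graph_mod.run()
```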
- With the current proposed changes, isn’t the overhead of running TVM on the device much higher than previously? How do I know an RPC server, a runtime, and an OS leave enough headroom for deploying useful neural networks on the device?
There is an increase in code size and a small increase in memory consumption, but only for autotuning. For deployment, the RPC server isn’t needed, and the OS would be whatever your project needs (if any), so we don’t see a large overhead there. For autotuning, you’re typically loading just one operator at a time, so we think the impact should be limited.
- Yesterday I tried to go through the Zephyr demo with a debugger, but the dependencies of the test were quite big and difficult to install on my machine. Do you have a proposed debugging strategy? Maybe it’s easiest if I run it inside the CI Docker image? Or would that be difficult? Sadly I have no experience with this myself.
We have a “Reference VM” that we just need to build and upload, plus a tutorial that should be published but is just missing a Sphinx directive to place it in the correct spot in the doc tree. The VM contains all of the Zephyr deps you need, and it’s a slightly better way to do this than Docker, since USB forwarding with Docker only works with libusb devices. You can try building these boxes yourself using `apps/microtvm/reference-vm/base-box-tool.py` if you don’t want to wait on me to upload them.