Introduction
Silicon Labs produces low-power wireless Cortex-M-based MCUs. Some of these MCUs, such as the EFR32xG24, contain an AI/ML hardware accelerator called the MVP (Matrix Vector Processor). The goal of this post is to start a discussion on how to integrate support for this accelerator into the TVM project.
Accelerator
The hardware accelerator is exposed to users through a C API in the Gecko SDK (the Silicon Labs software development kit). Our goal is to add the EFR32xG24 as a target in TVM so that users can optimize a model with TVM and generate C code that runs on an EFR32xG24 embedded device.
Proposal
Our plan is to implement Relay partitioning and C codegen. The work will consist of the following parts.
Build system
Support for the Silicon Labs HW accelerator in TVM will be enabled through the USE_SILABS configuration variable, which will cause the TVM build system to pick up the relevant source files. The source code for the contribution will be placed under python/tvm/relay/backend/contrib/silabs, src/relay/backend/contrib/silabs, and src/relay/op/contrib/silabs. Our goal is also to keep as much of the contribution as possible in the Python source tree.
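As a sketch, and assuming the flag follows the convention used by the existing backend switches in TVM's config.cmake (for example USE_CMSISNN and USE_ETHOSU), enabling the backend could look like this:

```cmake
# In build/config.cmake: enable the proposed Silicon Labs MVP backend.
set(USE_SILABS ON)
```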
Operator support
Partition patterns will be created to select the operator sequences that are supported by the HW accelerator. We are targeting a small subset of the int8 quantized operators: add, conv2d, dense, mul, avg_pool2d, and max_pool2d.
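To make this concrete, below is a minimal, hedged sketch of what the pattern table could look like for the conv2d case, following the style of the existing CMSIS-NN backend. The backend name "silabs" and the predicate logic are placeholders, not a finalized design:

```python
from tvm.relay.dataflow_pattern import is_constant, is_op, wildcard
from tvm.relay.op.contrib.register import register_pattern_table


def qnn_conv2d_pattern():
    """Match qnn.conv2d [+ nn.bias_add] + qnn.requantize sequences."""
    conv = is_op("qnn.conv2d")(
        wildcard(), is_constant(), is_constant(),
        is_constant(), is_constant(), is_constant(),
    )
    conv = is_op("nn.bias_add")(conv, is_constant()) | conv
    return is_op("qnn.requantize")(
        conv, is_constant(), is_constant(), is_constant(), is_constant()
    )


def check_qnn_conv2d(extract):
    # The real predicate would verify int8 dtypes and MVP-specific constraints.
    return True


@register_pattern_table("silabs")  # backend name is a placeholder
def pattern_table():
    return [("silabs.qnn_conv2d", qnn_conv2d_pattern(), check_qnn_conv2d)]
```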
Note that the ALU inside the accelerator operates on half-precision (16-bit) floating-point data. Part of the extra work that differentiates this from prior art such as CMSIS-NN is therefore adding code that converts quantization parameters and certain constant nodes to float16.
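As a minimal illustration of that conversion step (the helper name below is ours, purely for illustration), a quantization parameter stored as float32 could be rewritten into a float16 Relay constant like this:

```python
import numpy as np
from tvm import relay


def quant_param_to_fp16(values):
    """Rewrite quantization parameters (e.g. per-channel scales) as float16
    constants so the MVP's half-precision ALU can consume them directly."""
    return relay.const(np.asarray(values, dtype="float16"))


# Example: a per-channel output scale becomes a float16 constant node.
out_scales_fp16 = quant_param_to_fp16([0.0235, 0.0198, 0.0241])
```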
Target
We will add a new target called “silabs_mvpv1” with associated RelayToTIR and TIRToRuntime hooks.
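Assuming the target kind is registered with those hooks, an end-user flow could look roughly as follows. partition_for_silabs is an assumed helper name (mirroring partition_for_cmsisnn), and exactly how the extra target is passed to relay.build is still to be decided:

```python
import tvm
from tvm import relay

# Hypothetical helper, mirroring partition_for_cmsisnn / partition_for_ethosu.
from tvm.relay.op.contrib.silabs import partition_for_silabs


def compile_for_mvp(mod, params):
    mod = partition_for_silabs(mod, params)  # offload supported int8 patterns
    targets = [tvm.target.Target("silabs_mvpv1"), tvm.target.Target("c")]
    with tvm.transform.PassContext(opt_level=3):
        return relay.build(mod, target=targets, params=params)
```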
Codegen
The codegen part of the contribution will reuse CodeGenCHost to generate calls into the C API provided by the Gecko SDK.
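The mechanism we have in mind is that the RelayToTIR pass produces PrimFuncs whose bodies are tir.call_extern nodes, which CodeGenCHost then emits as plain C function calls. A rough sketch of such a wrapper (the extern symbol below is a placeholder, not an actual Gecko SDK function):

```python
import tvm
from tvm import tir

# Declare the buffers the wrapper operates on.
in_buf = tir.decl_buffer((1, 32, 32, 3), "int8", name="input")
out_buf = tir.decl_buffer((1, 32, 32, 8), "int8", name="output")

# A call_extern node is emitted by CodeGenCHost as a plain C call; the real
# implementation would call into the Gecko SDK MVP API instead of this stub.
body = tir.Evaluate(
    tir.call_extern("int32", "silabs_mvp_conv2d_stub", in_buf.data, out_buf.data)
)
wrapper = tir.PrimFunc(params=[in_buf.data, out_buf.data], body=body)
```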
Prior work
When working on this addition to TVM, we have drawn inspiration from the CMSIS-NN and Ethos-U work done by Arm.
Testing
A full system test would require physical hardware, so our plan is to create unit tests that exercise the Relay partitioning and TIR lowering. The tests will be placed in the tests/python/contrib/test_silabs directory.
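A hedged sketch of what such a partitioning unit test could look like (partition_for_silabs and the "silabs" compiler tag are assumed names):

```python
import tvm
from tvm import relay

# Assumed helper name for the proposed backend.
from tvm.relay.op.contrib.silabs import partition_for_silabs


def test_qnn_add_is_offloaded():
    x = relay.var("x", shape=(1, 16), dtype="int8")
    y = relay.var("y", shape=(1, 16), dtype="int8")
    out = relay.qnn.op.add(
        x, y,
        lhs_scale=relay.const(0.02, "float32"),
        lhs_zero_point=relay.const(0, "int32"),
        rhs_scale=relay.const(0.02, "float32"),
        rhs_zero_point=relay.const(0, "int32"),
        output_scale=relay.const(0.02, "float32"),
        output_zero_point=relay.const(0, "int32"),
    )
    mod = tvm.IRModule.from_expr(relay.Function([x, y], out))
    mod = partition_for_silabs(mod)

    # The partitioned module should contain a function tagged for the backend.
    compilers = [
        func.attrs["Compiler"]
        for _, func in mod.functions.items()
        if isinstance(func, relay.Function) and func.attrs and "Compiler" in func.attrs
    ]
    assert any("silabs" in str(c) for c in compilers)
```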
Conclusion
Does this work require an RFC document as described in the RFC Process, or is it something we can contribute as pull requests to the TVM project without an RFC?