Integration of muRISCV-NN kernel library in TVM

PhilippvK · July 12, 2022, 11:15am

Our team at the TU Munich (@r.stahl @fabian) has recently open-sourced our work on porting the ARM CMSIS-NN library to RISC-V targets: https://github.com/tum-ei-eda/muriscv-nn

Let me briefly summarize the features:

Support for 3 Modes: Default (Portable C-Code), Packed (P-Extension, sub-word SIMD, v0.9.2), Vector (V-Extension, super-word SIMD, v1.0)
CMSIS-NN Compatibility layer: You can simply link libmuriscv-nn.a instead of libcmsis-nn.a and you should be good to go without changing any code.
This allows using the library not only with the TFLite Micro framework but also in TVM via the CMSIS-NN BYOC integration.
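
The three modes correspond to build-time switches of the library. A minimal sketch, assuming the USE_VEXT and USE_PEXT CMake options used in the build command further down in this thread are the only switches involved (other options may exist). The function name muriscvnn_mode_flags is hypothetical and not part of the library:

```shell
#!/usr/bin/env bash
# Hypothetical helper: translate a muRISCV-NN mode name into CMake options.
# Assumes USE_VEXT/USE_PEXT (taken from the build command in this thread)
# are the relevant switches.
muriscvnn_mode_flags() {
    case "$1" in
        default) echo "-DUSE_VEXT=OFF -DUSE_PEXT=OFF" ;;  # portable C code
        packed)  echo "-DUSE_VEXT=OFF -DUSE_PEXT=ON" ;;   # P-Extension
        vector)  echo "-DUSE_VEXT=ON -DUSE_PEXT=OFF" ;;   # V-Extension
        *)       echo "unknown mode: $1" >&2; return 1 ;;
    esac
}
```

The output could then be spliced into the configure step, for example by passing $(muriscvnn_mode_flags vector) to cmake alongside the toolchain options.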

The main reason for this post is to discuss whether this library would be a good contribution to TVM, bringing RISC-V support in (Micro)TVM one step further. If there is interest, I would be happy to formulate an RFC for this.

However, these are my current concerns:

The library is already usable through the existing CMSIS-NN BYOC implementation, so it wouldn't make sense to copy and paste all of that code just to add muRISCV-NN support as well.
The only thing I would like to get rid of is the fake mapping of the mcpu PassConfig option, which the BYOC code uses to decide which extensions should be enabled. Currently, to enable the P/V-Extension we pass --target-cmsis-nn-mcpu=cortex-m33/55, which is quite unintuitive.
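
To make the workaround concrete, here is a minimal sketch of the fake mapping as it is applied by hand today. The pairing of cortex-m33 with the P-Extension is an assumption read off the "P/V" and "m33/55" ordering above; the pairing of cortex-m55 with the V-Extension matches the tvmc command later in this thread. The function name is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper illustrating the current workaround: choose the Arm CPU
# name that makes the CMSIS-NN BYOC code enable the matching code path.
# Assumed pairing: cortex-m33 <-> P-Extension, cortex-m55 <-> V-Extension.
fake_mcpu_for_riscv_ext() {
    case "$1" in
        p) echo "cortex-m33" ;;
        v) echo "cortex-m55" ;;
        *) echo "no mapping for extension: $1" >&2; return 1 ;;
    esac
}

# Prints: --target-cmsis-nn-mcpu=cortex-m55
echo "--target-cmsis-nn-mcpu=$(fake_mcpu_for_riscv_ext v)"
```

Having to name an Arm core in order to select a RISC-V extension is exactly the unintuitive part.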

I am looking forward to any feedback.

PhilippvK · July 12, 2022, 10:49am

Here are some metrics generated for the MLPerf Tiny benchmark. The instruction counts were obtained using the Spike simulator (an ISS) and are therefore not cycle-accurate, as RVV 1.0 compatible chips are not yet available.

areusch · July 12, 2022, 4:37pm

@PhilippvK this looks like great work! is there a demo script that shows how to integrate muRISCV-NN with TVM? perhaps we could look at the mcpu issue in that context.

PhilippvK · July 13, 2022, 12:22pm

I recently added some integration tests which can be found here: https://github.com/tum-ei-eda/muriscv-nn/blob/integration-tests/Integration/TVM/tvm_integration_tests.sh

Considering only the V-Extension, the complete flow can be broken down to:

# clone muriscvnn
git clone https://github.com/tum-ei-eda/muriscv-nn.git
cd muriscv-nn
git checkout integration-tests

# download toolchain
cd Toolchain && ./download_rv32gcv.sh && cd -
export TOOLCHAIN_DIR=$(pwd)/Toolchain/rv32gcv

# install tvm
virtualenv -p python3.8 .venv # optional
source .venv/bin/activate # optional
pip install "tlcpack-nightly" -f https://tlcpack.ai/wheels
pip install tflite

# install muriscvnn
cmake . -B./build -DUSE_VEXT=ON -DUSE_PEXT=OFF -DTOOLCHAIN=GCC -DRISCV_GCC_PREFIX=$TOOLCHAIN_DIR -DENABLE_TESTS=OFF
cmake --build ./build

# download model
wget -q https://raw.githubusercontent.com/tum-ei-eda/mlonmcu-models/main/resnet/resnet.tflite

# generate mlf
tvmc compile resnet.tflite \
        --runtime crt \
        --executor aot \
        --pass-config "tir.disable_vectorize=1" \
        --pass-config "tir.usmp.enable=1" \
        --pass-config "tir.usmp.algorithm=hill_climb" \
        --opt-level 3 \
        -f mlf \
        --runtime-crt-system-lib 0 \
        --target-c-constants-byte-alignment 4 \
        --target-c-workspace-byte-alignment 4 \
        --target-c-executor aot \
        --target-c-unpacked-api 1 \
        --target-c-interface-api c \
        --output mlf.tar \
        --target cmsis-nn,c \
        --target-cmsis-nn-mcpu=cortex-m55
mkdir -p mlf
tar xf mlf.tar -C mlf/

# build runtime
export PREFIX=$TOOLCHAIN_DIR/bin/riscv32-unknown-elf
cd mlf/runtime
cp template/crt_config-template.h crt_config.h
make common -j"$(nproc)" \
        CRT_CONFIG=crt_config.h \
        CC=$PREFIX-gcc \
        CXX=$PREFIX-g++ \
        RANLIB=$PREFIX-ranlib \
        EXTRA_CFLAGS="-Wno-error=incompatible-pointer-types"
cd -

# build target sw
KERNEL_SRCS=($(find ./mlf/codegen/host/src -name "*.c"))
# The following should be transformed into a makefile
$PREFIX-gcc -o main.elf Integration/TVM/sw/main.c "${KERNEL_SRCS[@]}" \
       -I./mlf/runtime/include/ \
       -I./mlf/codegen/host/include \
       -I./Include/CMSIS/NN/Include/ \
       -I./Include/ \
       -L./build/Source/ \
       -L./mlf/runtime/build/ \
       -lmuriscv_nn \
       -lcommon \
       -Wno-implicit-function-declaration \
       -Wno-incompatible-pointer-types \
       -Wno-attributes

# install spike
cd ./Sim/Spike/bin && ./download.sh && cd -

# run simulation
./Sim/Spike/bin/spike --isa=rv32gcv --varch=vlen:1024,elen:32 ./Sim/Spike/bin/pk main.elf

Ideally the whole flow would be based on MicroTVM, which currently lacks support for Spike simulation.

For the main discussion of the mcpu mapping, only the --target-cmsis-nn-mcpu=cortex-m55 part is relevant. For RISC-V targets, the available extensions can be obtained either from the -march=rv32gcv string (GCC) or from the -mattr=+v features (LLVM). Hardcoding mcpu -> MVEI/DSP mappings for RISC-V targets in the cmsis-nn BYOC backend does not sound like a good idea to me. In addition, if muRISCV-NN would be a good addition to the TVM ecosystem, we should also consider adding the following:

Add documentation/tutorials
Add Tests
CI Integration (RISC-V Toolchain (GCC/LLVM), Spike Simulator)
MicroTVM Template capable of running Spike
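
Regarding the idea of reading the extensions from -march or -mattr instead of a fake mcpu: a rough sketch of what such a check could look like, written as shell helpers only for illustration (an actual implementation would live in the BYOC backend). Both function names are hypothetical. Simplifying assumptions: lower-case input, no version suffixes such as "2p0", and single-letter extensions listed before any multi-letter (z*, s*, x*) ones:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if a GCC-style -march string contains the
# given single-letter extension (e.g. v or p).
march_has_ext() {
    local march="$1" ext="$2"
    local base="${march#rv32}"   # drop the XLEN prefix
    base="${base#rv64}"
    base="${base%%[zsx_]*}"      # drop multi-letter extensions (z*, s*, x*)
    [[ "$base" == *"$ext"* ]]
}

# Hypothetical helper: succeed if an LLVM-style -mattr list enables a feature.
mattr_has_feature() {
    local mattr="$1" feature="$2"
    [[ ",$mattr," == *",+$feature,"* ]]
}

march_has_ext rv32gcv v && echo "V-Extension enabled (GCC -march)"
mattr_has_feature "+m,+a,+v" v && echo "V-Extension enabled (LLVM -mattr)"
```

With a check along these lines, the backend could select the vector or packed kernels from information the target already carries, without naming an Arm core.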