Can't link soft-float modules with double-float modules

apivovarov · August 31, 2021, 6:40am

We can. I decided to name the parameter -mabi to match gcc. I also added several popular SiFive CPUs to the tvm.target module, e.g.

target=tvm.target.riscv_cpu("sifive-u54")
...
lib.export_library("model.so", cc="riscv64-unknown-linux-gnu-g++", options=["-march=rv64gc", "-mabi=lp64d", "-static-libstdc++"])

github.com/apache/tvm

[RISCV] Add support for llvm parameter -mabi (aka -target-abi)

main ← apivovarov:target-abi · opened by apivovarov, 08:32PM - 26 Aug 21 UTC · +85 -3

This PR adds support for llvm/llc parameter `-mabi` (aka `-target-abi`). Target/Machine ABI parameter is needed to compile models for RISC-V cpu. RISC-V has six ABIs - `ilp32`, `ilp32f`, `ilp32d`, `lp64`, `lp64f`, `lp64d`.

CPU example is [sifive-u54 (llvm)](https://github.com/llvm/llvm-project/blob/d480f968ad8b56d3ee4a6b6df5532d485b0ad01e/llvm/include/llvm/Support/RISCVTargetParser.def#L23), [sifive-u54 (gcc)](https://github.com/riscv/riscv-gcc/blob/c3911e6425f35e0722129cb30cc5ccaf3390cd75/gcc/config/riscv/riscv-cores.def#L46)

Specifying just `mcpu=sifive-u54` parameter is not enough since particular CPU can be used on platforms with different ABIs. When we compile the model we also need to specify target/machine ABI. For example `-mabi=lp64d`.

Target string example:

```
llvm -mtriple=riscv64-unknown-linux-gnu -mcpu=sifive-u54 -mabi=lp64d -device=arm_cpu
```

If we do not specify `-mabi=lp64d` then the linking process (`lib.export_library`) which generates `model.so` will fail with the following error

```
can't link soft-float modules with double-float modules
```

The compilation and export_library was tested for `relay.build` single output (GraphExecutorFactoryModule) and for three outputs (graph, lib, params).

I also added several popular RISC-V CPUs to `tvm.target` module.

```
sifive-e31
sifive-e76
sifive-u54
sifive-u74
```

About the parameter name. The name is different in gcc and llc: `gcc` uses `-mabi` but `llc` uses `-target-abi` (`llc` parameter is hidden (not in the help)).

Related discussion: https://discuss.tvm.apache.org/t/cant-link-soft-float-modules-with-double-float-modules/10140

areusch · August 31, 2021, 7:55am

cc @heatdh does this solve or affect your problem?

Heatdh · September 1, 2021, 2:57am

Hello Mr @areusch, I haven’t tried the fix yet, but it seems to tackle the same error. For me this happens while linking the object files after using export_model. An additional problem is that our toolchain’s gcc compiler doesn’t support this, so I have just overwritten the cmd list that is launched while exporting. I also tried manually saving the .s files, removing the GNU soft-float math library functions __mulsf3, __divsf3, __addsf3, __subsf3, running gcc without linking, and then copying the files. With that, the simulation worked fine with no issues for the TVMCodegen, but the output was wrong. (This has to be done manually for each model; I can check the files and remove the functions within a loop, but for uTVM some function definitions are missing.) @apivovarov, I’m interested in your usage of -static-libstdc++. Would you please explain this attribute? (I’m afraid that the compiler we are using doesn’t support it, but still.)

Heatdh · September 1, 2021, 3:56am

Trying the fix, I ran into the same error, since we specify a 32-bit version (our deployment is on rv32gc). Changing to mabi=ilp32 eliminates the functions causing the problems (inspected through readelf and by exporting to a .s file). However, I get an ELF64/ELF32 mismatch error, or the reverse, depending on the target. Note: any 32-bit mention in the target produces these functions.
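For anyone debugging this kind of mismatch, it may help to look at the ELF header of each object file directly. Below is a minimal sketch, assuming the standard ELF header layout and the float-ABI bits that the RISC-V psABI defines in `e_flags`; the function name `describe_riscv_elf` is made up for illustration and is not part of TVM.

```python
import struct

# Float-ABI values stored in bits 1-2 of e_flags (RISC-V psABI).
FLOAT_ABI = {0x0: "soft", 0x2: "single", 0x4: "double", 0x6: "quad"}
EM_RISCV = 243  # e_machine value for RISC-V


def describe_riscv_elf(header: bytes) -> dict:
    """Report the ELF class and float ABI from the first bytes of an ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    elf_class = {1: "ELF32", 2: "ELF64"}[header[4]]
    endian = {1: "<", 2: ">"}[header[5]]
    # e_flags follows e_entry/e_phoff/e_shoff, whose width depends on the class.
    flags_offset = 36 if elf_class == "ELF32" else 48
    if len(header) < flags_offset + 4:
        raise ValueError("header is too short")
    (machine,) = struct.unpack_from(endian + "H", header, 18)
    if machine != EM_RISCV:
        raise ValueError("not a RISC-V object")
    (e_flags,) = struct.unpack_from(endian + "I", header, flags_offset)
    return {"class": elf_class, "float_abi": FLOAT_ABI[e_flags & 0x6]}
```

Reading the first 64 bytes of a file (for example `open("model.o", "rb").read(64)`) is enough input. If two objects that are linked together report a different class or float ABI, that would be consistent with the ELF32/ELF64 and soft-float/double-float errors discussed in this thread.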

apivovarov · September 1, 2021, 6:16pm

TVM uses LLVM to generate LLVM IR (model.ll) and then compile it to object file (model.o).

After that the object file should be “linked” to a shared library. TVM can run g++ or clang++ command for you to link the object file (model.o) to a shared library (model.so) - it is implemented inside lib.export_library method. Because the process of generating model.so has two stages - LLVM stage and linking stage (g++/clang++), we need to make sure that both stages generate output for the same mcpu/march/mabi. E.g.

# LLVM compilation for cpu sifive-u54 (llvm knows its arch rv64gc) and use lp64d ABI
target="llvm -mtriple=riscv64-unknown-linux-gnu -mcpu=sifive-u54 -mabi=lp64d -device=arm_cpu"

# Linking for arch rv64gc and use lp64d ABI.
lib.export_library("model.so", cc="riscv64-unknown-linux-gnu-g++", options=["-march=rv64gc", "-mabi=lp64d", "-mcpu=sifive-u54"])
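Since the two stages are configured in two different places, a small pre-flight check can catch a mismatch before the linker does. The following is only a sketch in plain Python: `parse_flags` and `abi_mismatches` are hypothetical helper names, and the check simply compares `-key=value` flags as text.

```python
def parse_flags(tokens) -> dict:
    """Collect -key=value flags from an iterable of tokens."""
    flags = {}
    for token in tokens:
        if token.startswith("-") and "=" in token:
            key, value = token.lstrip("-").split("=", 1)
            flags[key] = value
    return flags


def abi_mismatches(target: str, options: list, keys=("mabi", "mcpu")) -> dict:
    """Return {flag: (target_value, linker_value)} for flags that disagree.

    A flag that is missing on one side is reported with None on that side.
    """
    target_flags = parse_flags(target.split())
    linker_flags = parse_flags(options)
    return {
        key: (target_flags.get(key), linker_flags.get(key))
        for key in keys
        if target_flags.get(key) != linker_flags.get(key)
    }
```

An empty result means the LLVM target string and the linker options agree on the checked flags. `-march` is not compared because the target string above does not carry it.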

--enable-multilib !!!

The linking process links not just your model.o but also the system libraries that model.o needs (the C library and the C++ STL library). So, your cross-compilation toolchain should have pre-compiled system libraries for the mcpu/march/mabi which you are using in TVM. By default riscv-gnu-toolchain only has lib64 rv64gc / lp64d system libraries. Check what ABIs you have:

ls <riscv-gnu-toolchain_install_dir>/riscv64-unknown-linux-gnu/lib64
lp64  lp64d

ls <riscv-gnu-toolchain_install_dir>/riscv64-unknown-linux-gnu/lib32
ilp32  ilp32d
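The same directory check can be scripted. This is a small sketch that assumes the install layout shown above (`<install_dir>/<triple>/lib32` and `lib64`, each with one sub-directory per ABI); `installed_abis` is a made-up helper name.

```python
from pathlib import Path


def installed_abis(install_dir, triple="riscv64-unknown-linux-gnu") -> dict:
    """Map each existing lib32/lib64 directory to its ABI sub-directories."""
    root = Path(install_dir) / triple
    found = {}
    for libdir in ("lib32", "lib64"):
        path = root / libdir
        if path.is_dir():
            found[libdir] = sorted(p.name for p in path.iterdir() if p.is_dir())
    return found
```

If the ABI you pass as -mabi does not appear in the result, the toolchain was probably built without the matching system libraries.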

To get pre-compiled system libraries for all ABIs use --enable-multilib flag when building riscv-gnu-toolchain

./configure --prefix=/opt/riscv --enable-multilib

--print-multi-lib

You can also use the g++ --print-multi-lib flag to list the pre-compiled system libraries (march/mabi combinations) which your toolchain has.

riscv64-unknown-linux-gnu-g++ --print-multi-lib
.;
lib32/ilp32;@march=rv32imac@mabi=ilp32
lib32/ilp32d;@march=rv32imafdc@mabi=ilp32d
lib64/lp64;@march=rv64imac@mabi=lp64
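The listing is easy to parse if you want to check a march/mabi pair automatically. A minimal sketch follows, assuming the `directory;@flag=value@flag=value` line format shown above; `parse_multilib` and `has_multilib` are illustrative names.

```python
def parse_multilib(output: str) -> list:
    """Parse `--print-multi-lib` output into (directory, {flag: value}) pairs."""
    entries = []
    for line in output.splitlines():
        line = line.strip()
        if ";" not in line:
            continue
        directory, _, switches = line.partition(";")
        flags = {}
        for switch in switches.split("@"):
            if "=" in switch:
                key, value = switch.split("=", 1)
                flags[key] = value
        entries.append((directory, flags))
    return entries


def has_multilib(output: str, march: str, mabi: str) -> bool:
    """True if an explicit entry for this march/mabi pair is listed."""
    return any(
        flags.get("march") == march and flags.get("mabi") == mabi
        for _, flags in parse_multilib(output)
    )
```

Note that the `.;` entry is the toolchain's default configuration and carries no flags, so this sketch cannot tell which march/mabi it stands for; a `False` result only means there is no explicit entry for the pair.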

-static-libstdc++

-static-libstdc++ is a flag for the linker (g++/clang++) to include the C++ STL functions in your shared library (model.so). In that case the model (model.so) will not need libstdc++.so on the device. If your device has the C++ STL library libstdc++.so then you do not need to use the -static-libstdc++ flag.

To check model.so dependencies:

readelf -d model.so | grep NEEDED
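If you want to run this check from a script, the readelf output can be filtered in Python. This sketch assumes the usual `(NEEDED) Shared library: [name]` line format printed by GNU readelf; the helper names are hypothetical.

```python
import re

_NEEDED = re.compile(r"\(NEEDED\)\s+Shared library:\s+\[(?P<name>[^\]]+)\]")


def needed_libraries(readelf_output: str) -> list:
    """Return the shared libraries listed as NEEDED, in order of appearance."""
    return [m.group("name") for m in _NEEDED.finditer(readelf_output)]


def needs_libstdcxx(readelf_output: str) -> bool:
    """True if the library still depends on a shared libstdc++."""
    return any(
        name.startswith("libstdc++.so") for name in needed_libraries(readelf_output)
    )
```

The input would be the text printed by `readelf -d model.so`, captured for example with `subprocess.run(..., capture_output=True, text=True).stdout`.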

Heatdh · September 5, 2021, 2:30am

Thanks @apivovarov for the explanation. Sorry, I mistyped here and meant ilp64 instead of ilp32; it appears that the generated .s is incompatible with rv32gc because its structure is ELF64. For now, after exporting the .s and generating the object files, I no longer run into the double-float error. However, the soft floating-point library functions __addsf3, __mulsf3, etc. are still used. I tried adding a .c file that catches these functions while linking. The simulation works fine, but the output is wrong for all simulated models. I’m afraid that they are somehow crucial, although in the .s I did not see any variables being passed to them. We are using SiFive’s riscv64-unknown-elf toolchain, which is already pre-built with multilib enabled. The supported architectures are the following:

rv32e/ilp32e;@march=rv32e@mabi=ilp32e
rv32ea/ilp32e;@march=rv32ea@mabi=ilp32e
rv32em/ilp32e;@march=rv32em@mabi=ilp32e
rv32eac/ilp32e;@march=rv32eac@mabi=ilp32e
rv32emac/ilp32e;@march=rv32emac@mabi=ilp32e
rv32i/ilp32;@march=rv32i@mabi=ilp32
rv32if/ilp32f;@march=rv32if@mabi=ilp32f
rv32ifd/ilp32d;@march=rv32ifd@mabi=ilp32d
rv32ia/ilp32;@march=rv32ia@mabi=ilp32
rv32iaf/ilp32f;@march=rv32iaf@mabi=ilp32f
rv32imaf/ilp32f;@march=rv32imaf@mabi=ilp32f
rv32iafd/ilp32d;@march=rv32iafd@mabi=ilp32d
rv32im/ilp32;@march=rv32im@mabi=ilp32
rv32imf/ilp32f;@march=rv32imf@mabi=ilp32f
rv32imfc/ilp32f;@march=rv32imfc@mabi=ilp32f
rv32imfd/ilp32d;@march=rv32imfd@mabi=ilp32d
rv32iac/ilp32;@march=rv32iac@mabi=ilp32
rv32imac/ilp32;@march=rv32imac@mabi=ilp32
rv32imafc/ilp32f;@march=rv32imafc@mabi=ilp32f
rv32imafdc/ilp32d;@march=rv32imafdc@mabi=ilp32d
rv64i/lp64;@march=rv64i@mabi=lp64

Do you have any idea what this depends on?

apivovarov · September 5, 2021, 6:34am

march rv32imafc needs mabi ilp32f (with f), mcpu sifive-e76.

march rv32gc needs mabi ilp32d (with d), llvm and gcc do not have mcpu like that.

march rv64gc needs mabi lp64d (without i and with d), mcpu sifive-u54 or sifive-u74
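The pattern in these three rules can be written down as a small lookup. This is a sketch only: `hard_float_abi` is a made-up name, it picks the ABI that matches the widest floating-point extension in the march string (other ABIs can be valid for the same march), and it assumes multi-letter extensions, if any, are separated with `_`.

```python
def hard_float_abi(march: str) -> str:
    """Pick the ABI matching the widest FP extension named in a march string."""
    march = march.lower()
    if march.startswith("rv32"):
        base = "ilp32"
    elif march.startswith("rv64"):
        base = "lp64"
    else:
        raise ValueError(f"unsupported march: {march!r}")
    # Keep only the run of single-letter extensions before any "_" suffix.
    exts = march[4:].split("_", 1)[0]
    if exts.startswith("e"):
        return base + "e"  # reduced-register base ISA (e.g. rv32e -> ilp32e)
    if "d" in exts or "g" in exts:  # "g" is shorthand that includes f and d
        return base + "d"
    if "f" in exts:
        return base + "f"
    return base
```

For the cases above this gives `ilp32f` for rv32imafc, `ilp32d` for rv32gc, and `lp64d` for rv64gc.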

Heatdh · September 5, 2021, 6:01pm

I forgot to mention that setting the mabi did indeed solve the double-float issue as well. However, the output is completely wrong compared to tvmc. I’m afraid that excluding the arithmetic functions used in the implementation of some layers, which I did to deal with the undefined references, led to that.

apivovarov · September 5, 2021, 8:59pm

I tested some models on QEMU rv64gc Ubuntu - the outputs were the same as on x86_64.

qemu-system-riscv64 \
-machine virt -nographic -m 2G -smp 4 \
-bios /usr/lib/riscv64-linux-gnu/opensbi/generic/fw_jump.elf \
-kernel /usr/lib/u-boot/qemu-riscv64_smode/uboot.elf \
-device virtio-net-device,netdev=eth0 \
-netdev user,id=eth0,hostfwd=tcp::2222-:22 \
-drive file=ubuntu-20.04.2-preinstalled-server-riscv64.img,format=raw,if=virtio

What hardware/emulator/cpu/OS are you using?

Heatdh · September 7, 2021, 7:57am

Hello @apivovarov, I am using a custom instruction set simulator, ETISS. I think this is somehow linked to our interface and the way it processes the input, and that the implementations are somehow incompatible with the generated kernels. After inspection, the generated kernels use the implementation in crt_backend_api and bundle (the layout matches). However, our top level, which is needed to process the data, uses crt_runtime_api, so while searching it doesn’t find the proper implementation:

    /home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/gcc_riscv_dbg/bin/../lib/gcc/riscv64-unknown-elf/8.3.0/../../../../riscv64-unknown-elf/bin/ld: ../libtvm_static_rt.a(crt_runtime_api.c.obj): in function `TVMArrayAlloc':
/home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/tvm/src/runtime/crt/common/crt_runtime_api.c:70: undefined reference to `TVMNDArray_Empty'
/home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/gcc_riscv_dbg/bin/../lib/gcc/riscv64-unknown-elf/8.3.0/../../../../riscv64-unknown-elf/bin/ld: ../libtvm_static_rt.a(crt_runtime_api.c.obj): in function `TVMArrayFree':
/home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/tvm/src/runtime/crt/common/crt_runtime_api.c:80: undefined reference to `TVMNDArray_Release'
/home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/gcc_riscv_dbg/bin/../lib/gcc/riscv64-unknown-elf/8.3.0/../../../../riscv64-unknown-elf/bin/ld: ../libtvm_static_rt.a(crt_runtime_api.c.obj): in function `TVMDeviceAllocDataSpace':
/home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/tvm/src/runtime/crt/common/crt_runtime_api.c:89: undefined reference to `TVMPlatformMemoryAllocate'
/home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/gcc_riscv_dbg/bin/../lib/gcc/riscv64-unknown-elf/8.3.0/../../../../riscv64-unknown-elf/bin/ld: ../libtvm_static_rt.a(crt_runtime_api.c.obj): in function `TVMDeviceFreeDataSpace':
/home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/tvm/src/runtime/crt/common/crt_runtime_api.c:106: undefined reference to `TVMPlatformMemoryFree'
/home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/gcc_riscv_dbg/bin/../lib/gcc/riscv64-unknown-elf/8.3.0/../../../../riscv64-unknown-elf/bin/ld: ../libtvm_static_rt.a(crt_runtime_api.c.obj): in function `TVMFuncRegisterGlobal':
/home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/tvm/src/runtime/crt/common/crt_runtime_api.c:143: undefined reference to `TVMMutableFuncRegistry_Set'
/home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/gcc_riscv_dbg/bin/../lib/gcc/riscv64-unknown-elf/8.3.0/../../../../riscv64-unknown-elf/bin/ld: ../libtvm_static_rt.a(crt_runtime_api.c.obj): in function `TVMModFree':
/home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/tvm/src/runtime/crt/common/crt_runtime_api.c:192: undefined reference to `TVMSystemLibEntryPoint'
/home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/gcc_riscv_dbg/bin/../lib/gcc/riscv64-unknown-elf/8.3.0/../../../../riscv64-unknown-elf/bin/ld: ../libtvm_static_rt.a(crt_runtime_api.c.obj): in function `DecodeFunctionHandle':
/home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/tvm/src/runtime/crt/common/crt_runtime_api.c:242: undefined reference to `TVMPlatformMemoryFree'
/home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/gcc_riscv_dbg/bin/../lib/gcc/riscv64-unknown-elf/8.3.0/../../../../riscv64-unknown-elf/bin/ld: /home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/tvm/src/runtime/crt/common/crt_runtime_api.c:245: undefined reference to `TVMPlatformMemoryFree'
/home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/gcc_riscv_dbg/bin/../lib/gcc/riscv64-unknown-elf/8.3.0/../../../../riscv64-unknown-elf/bin/ld: ../libtvm_static_rt.a(crt_runtime_api.c.obj): in function `TVMFuncCall':
/home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/tvm/src/runtime/crt/common/crt_runtime_api.c:276: undefined reference to `TVMFuncRegistry_GetByIndex'
/home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/gcc_riscv_dbg/bin/../lib/gcc/riscv64-unknown-elf/8.3.0/../../../../riscv64-unknown-elf/bin/ld: /home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/tvm/src/runtime/crt/common/crt_runtime_api.c:285: undefined reference to `TVMFuncRegistry_Lookup'
/home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/gcc_riscv_dbg/bin/../lib/gcc/riscv64-unknown-elf/8.3.0/../../../../riscv64-unknown-elf/bin/ld: ../libtvm_static_rt.a(crt_runtime_api.c.obj): in function `TVMCFuncSetReturn':
/home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/tvm/src/runtime/crt/common/crt_runtime_api.c:378: undefined reference to `TVMPlatformMemoryAllocate'
/home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/gcc_riscv_dbg/bin/../lib/gcc/riscv64-unknown-elf/8.3.0/../../../../riscv64-unknown-elf/bin/ld: ../libtvm_static_rt.a(crt_runtime_api.c.obj): in function `RPCGetCRTMaxPacketSize':
/home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/tvm/src/runtime/crt/common/crt_runtime_api.c:390: undefined reference to `TVMPlatformMemoryAllocate'
/home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/gcc_riscv_dbg/bin/../lib/gcc/riscv64-unknown-elf/8.3.0/../../../../riscv64-unknown-elf/bin/ld: /home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/tvm/src/runtime/crt/common/crt_runtime_api.c:394: undefined reference to `TVMPlatformMemoryFree'
/home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/gcc_riscv_dbg/bin/../lib/gcc/riscv64-unknown-elf/8.3.0/../../../../riscv64-unknown-elf/bin/ld: ../libtvm_static_rt.a(crt_runtime_api.c.obj): in function `TVMInitializeRuntime':
/home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/tvm/src/runtime/crt/common/crt_runtime_api.c:404: undefined reference to `TVMMutableFuncRegistry_Create'
/home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/gcc_riscv_dbg/bin/../lib/gcc/riscv64-unknown-elf/8.3.0/../../../../riscv64-unknown-elf/bin/ld: /home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/tvm/src/runtime/crt/common/crt_runtime_api.c:427: undefined reference to `TVMPlatformMemoryFree'
/home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/gcc_riscv_dbg/bin/../lib/gcc/riscv64-unknown-elf/8.3.0/../../../../riscv64-unknown-elf/bin/ld: /home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/tvm/src/runtime/crt/common/crt_runtime_api.c:427: undefined reference to `TVMPlatformMemoryFree'
/home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/gcc_riscv_dbg/bin/../lib/gcc/riscv64-unknown-elf/8.3.0/../../../../riscv64-unknown-elf/bin/ld: ../libtvm_static_rt.a(crt_runtime_api.c.obj): in function `RPCTimeEvaluator':
/home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/tvm/src/runtime/crt/common/crt_runtime_api.c:487: undefined reference to `TVMPlatformMemoryAllocate'
/home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/gcc_riscv_dbg/bin/../lib/gcc/riscv64-unknown-elf/8.3.0/../../../../riscv64-unknown-elf/bin/ld: /home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/tvm/src/runtime/crt/common/crt_runtime_api.c:492: undefined reference to `TVMPlatformMemoryAllocate'
/home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/gcc_riscv_dbg/bin/../lib/gcc/riscv64-unknown-elf/8.3.0/../../../../riscv64-unknown-elf/bin/ld: ../libtvm_static_rt.a(crt_runtime_api.c.obj): in function `RunTimeEvaluator':
/home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/tvm/src/runtime/crt/common/crt_runtime_api.c:500: undefined reference to `TVMPlatformTimerStart'
/home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/gcc_riscv_dbg/bin/../lib/gcc/riscv64-unknown-elf/8.3.0/../../../../riscv64-unknown-elf/bin/ld: /home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/tvm/src/runtime/crt/common/crt_runtime_api.c:518: undefined reference to `TVMPlatformTimerStop'
/home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/gcc_riscv_dbg/bin/../lib/gcc/riscv64-unknown-elf/8.3.0/../../../../riscv64-unknown-elf/bin/ld: /home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/tvm/src/runtime/crt/common/crt_runtime_api.c:538: undefined reference to `TVMPlatformMemoryFree'
/home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/gcc_riscv_dbg/bin/../lib/gcc/riscv64-unknown-elf/8.3.0/../../../../riscv64-unknown-elf/bin/ld: /home/heatdh/Desktop/Ba/ml_on_mcu/deps/install/tvm/src/runtime/crt/common/crt_runtime_api.c:541: undefined reference to `TVMPlatformMemoryFree'
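A long linker log like this is easier to read once the missing symbols are de-duplicated. The sketch below assumes the GNU ld message format seen above (`undefined reference to` followed by the quoted symbol); `undefined_symbols` is an illustrative helper name.

```python
import re
from collections import Counter

_UNDEFINED = re.compile(r"undefined reference to [`'](?P<symbol>[^`']+)'")


def undefined_symbols(log: str) -> Counter:
    """Count how often each symbol is reported as an undefined reference."""
    return Counter(m.group("symbol") for m in _UNDEFINED.finditer(log))
```

Calling `sorted(undefined_symbols(log))` on the output above reduces it to a short list of distinct names, including several `TVMPlatform*` functions and the `TVMFuncRegistry`/`TVMNDArray` helpers, which makes it easier to see which source files or platform functions are missing from the link.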

apivovarov · September 7, 2021, 7:26pm

riscv-gnu-toolchain supports two sys/abi:

riscv64-unknown-linux-gnu - for Linux. It uses glibc
riscv64-unknown-elf - for Bare-metal. It uses Newlib C library instead of glibc.

The output above says riscv64-unknown-elf - which indicates the compilation/linking for Bare-metal/ELF/Newlib.
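The same reading of the toolchain name can be expressed as a tiny helper. This is just a sketch based on the two triples named above; `toolchain_environment` and its return labels are made up for illustration, and any other triple is reported as unknown.

```python
import re


def toolchain_environment(name: str) -> str:
    """Guess the runtime environment from a toolchain triple, binary name, or path."""
    if re.search(r"-linux-gnu(?![A-Za-z0-9])", name):
        return "linux-glibc"  # e.g. riscv64-unknown-linux-gnu-g++
    if re.search(r"-elf(?![A-Za-z0-9])", name):
        return "bare-metal-newlib"  # e.g. riscv64-unknown-elf-gcc
    return "unknown"
```

It works on a compiler name such as `riscv64-unknown-linux-gnu-g++` as well as on a full path like the ones in the linker output.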

I have tested TVM on RISC-V GNU/Linux Ubuntu. I have not tried it on bare-metal yet.

Can you share detailed step-by-step instructions on how you compile and run the model on bare-metal/ELF end-to-end?