Implementation of Hexagon Runtime for Target

nikzb · March 3, 2021, 5:26am

I am currently trying to run TVM Hexagon on an Android device (Snapdragon). Just to get started, I wish to offload a simple matrix multiplication on the aDSP with TVM.

So far, my understanding from reading the page “Introducing Hexagon Backend” is that I need to build the following:

IDL Libraries
stub library
skel library

I followed the instructions from the git repo and I now have libtvm_runtime.so including the stub and skel libraries.

Following the documentation from the Hexagon SDK, I pushed these files to the device:
for the stub: /vendor/lib/
for the skel: /vendor/lib/rfsa/adsp/

After this, I have no clue yet how to proceed forward. Basically my question is how can I run my matmul on the device by running some Python script on my local machine? If the only way is to run on the adb shell, what is the procedure?

I have three ideas for tackling this problem:
My first idea was to modify the android_rpc app in some way so that I can run a Python script on my local machine and communicate with the phone over RPC. In that script I could do something like this (similar to running on CPU or GPU):
ctx = remote.hexagon(0)
remote.upload(path_dso_dsp)

My idea was that, in the rpc_app, it just needs to change between cpu and opencl, so maybe for hexagon it’s as easy as setting the context appropriately. If this is actually the right way to do it, do I only need to change the runtime found in the src? (e.g. in /app/src/main/libs/arm64-v8a/lib). Right now I can see that it has libc++_shared.so and libtvm4j_runtime_packed.so.

The second solution I can think of is to follow how other examples from the Hexagon SDK documentation are implemented. First, they also build the stub and dynamic (skel) libraries, and after that they build an executable that can be run in the adb shell. I have looked through their project structures, and it seems that the source files for actually running something (for example the calculator) are written in C. If this is the only way, does this mean I have to write the same kind of executable in C to be able to offload my matmul (using TVM) to the DSP?

The third solution I can think of is installing python in Android like here. And from the adb shell I can run this script. Would this form of solution cause some sort of problems like some dependencies of python or TVM might be unresolved?

I am very new to this field. I apologize if I have some misunderstandings in any of the concepts in this matter.

Thank you so much in advance for your help!

CC: Do you have any advice? Any help would be appreciated! @kparzysz

kparzysz · March 5, 2021, 4:16pm

You’re on the right track. Generally, the way to run something on Hexagon is to run an app on the CPU, and have it offload code to Hexagon via the FastRPC mechanism. If your Hexagon code has a function foo, and you want to call it from the CPU, you create the IDL description of foo’s interface, and the generated stub and skel libraries are the CPU and Hexagon sides, respectively, of the remote call to foo.

The direct communication via C or C++ program running on CPU is certainly possible, but it’s not necessarily convenient for use with TVM: you would use TVM to build a shared library for Hexagon, then you would need to write that C/C++ code that would load it and execute it.

TVM does have a framework to make the whole thing easier. There is a tool called “RPC tracker”, which serves as a hub to which actual RPC servers connect. An app wanting to offload something would connect to the tracker, and from the tracker it would get the connection with the corresponding RPC server. The app could then upload the code for remote execution and run it.
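The tracker's role as a hub can be pictured with a small toy model. This is not TVM's implementation, only an illustration of the idea: the class name ToyTracker and its methods are invented for this sketch, and where the real tracker can queue a request until a server becomes free, this version simply raises an error.

```python
from collections import defaultdict, deque


class ToyTracker:
    """Toy model of a tracker: keeps a queue of idle servers per key."""

    def __init__(self):
        # key -> queue of idle server addresses registered under that key
        self._idle = defaultdict(deque)

    def register(self, key, server_addr):
        """Called by an RPC server to announce itself under a key."""
        self._idle[key].append(server_addr)

    def request(self, key):
        """Called by a client: hand out one idle server for the key."""
        if not self._idle[key]:
            raise LookupError(f"no idle server registered under key {key!r}")
        return self._idle[key].popleft()

    def release(self, key, server_addr):
        """Return a server to the idle pool once the client is done."""
        self._idle[key].append(server_addr)


tracker = ToyTracker()
tracker.register("android", ("127.0.0.1", 5001))  # the device-side server
addr = tracker.request("android")                 # the PC-side client
print(addr)
```

The point of the indirection is that the client only needs to know the tracker's address and a key, not where each server lives.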

There is already an app that performs the role of the RPC server: it’s the cpp_rpc program, so most of the pieces of the puzzle are present.

Now, the workflow is basically as follows:

Start the RPC tracker on a PC. It will typically listen on port 9190. Once it starts, leave it running.
Set up port forwarding on the device (this is so that the TCP/IP connection between the tracker and the app will work):

     adb forward tcp:5001 tcp:5001
     adb reverse tcp:9190 tcp:9190

Copy the tvm_rpc app to some directory on the device. I usually copy the stub libraries there as well (but it may work with the libraries in /vendor/lib as well). Then start the app (the key is just some identifier that the RPC server will register itself under with the tracker):

     cd /the/path/on/device
     export LD_LIBRARY_PATH=.:${LD_LIBRARY_PATH}  # this is if stub libs are in this dir
     export ADSP_LIBRARY_PATH=/vendor/lib/rfsa/adsp:${ADSP_LIBRARY_PATH}
     ./tvm_rpc server --host=127.0.0.1 --port=5001 --tracker=127.0.0.1:9190 --key=<some-word>

At this point you can run TVM Python code on your PC and have it communicate with the device.
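Before running the TVM script, it may help to confirm from the PC that the ports used above accept connections. The following is a minimal sketch using only the standard library; the helper names port_is_open and check_rpc_setup are made up here, and the defaults assume the tracker on 9190 and the forwarded server port 5001 from the steps above. Note that a successful connection to 5001 may only show that adb is listening for the forward, not that tvm_rpc itself is running on the device.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_rpc_setup(host="127.0.0.1", tracker_port=9190, server_port=5001):
    """Probe the tracker port and the adb-forwarded server port."""
    return {
        "tracker": port_is_open(host, tracker_port),
        "server (via adb forward)": port_is_open(host, server_port),
    }


if __name__ == "__main__":
    for name, ok in check_rpc_setup().items():
        print(f"{name}: {'reachable' if ok else 'NOT reachable'}")
```

If TVM is installed on the PC, running python3 -m tvm.exec.query_rpc_tracker --host 127.0.0.1 --port 9190 should additionally list the keys that servers have registered with the tracker.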

From the point of view of the Python code itself, you need to indicate in your schedule which part of your kernel needs to be outlined. If s is the schedule, and Op is the operation which you want to offload, mark it as pipeline:

px, x = s[Op].split(s[Op].op.axis[0], nparts=1)
s[Op].bind(px, tvm.te.thread_axis("pipeline"))

Then do tvm.build with

target = tvm.target.hexagon()
target_host = 'llvm -mtriple=aarch64-unknown-linux-android26'

(you can use a different version of Android as the OS here). The output of tvm.build, let’s call it m, will be a module with code built for AArch64. It will also contain another module in it with the code for Hexagon. Now you should be able to save both of these modules individually:

m.save('cpu.so')
m.imported_modules[0].save('hexagon.so')  # this will also create a .json file
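For the matrix multiplication from the original question, the scheduling and build steps might be assembled roughly as follows. This is only a sketch under assumptions: it targets the TVM API from around the time of this thread (te.create_schedule and the target_host argument of tvm.build have changed in later releases), and it needs a TVM build whose LLVM supports Hexagon. The function name build_offloaded_matmul, the size n and the tensor names are invented for illustration.

```python
HOST_TARGET = "llvm -mtriple=aarch64-unknown-linux-android26"


def build_offloaded_matmul(n=64):
    """Build C = A x B with the whole computation marked for offload."""
    # Imported inside the function so the sketch loads without TVM installed.
    import tvm
    from tvm import te

    # Declare the computation: an n x n float32 matrix product.
    A = te.placeholder((n, n), name="A", dtype="float32")
    B = te.placeholder((n, n), name="B", dtype="float32")
    k = te.reduce_axis((0, n), name="k")
    C = te.compute(
        (n, n),
        lambda i, j: te.sum(A[i, k] * B[k, j], axis=k),
        name="C",
    )

    # Mark the operation as "pipeline", as described above: split off an
    # outer axis of extent 1 and bind it to the pipeline thread axis.
    s = te.create_schedule(C.op)
    px, x = s[C].split(s[C].op.axis[0], nparts=1)
    s[C].bind(px, te.thread_axis("pipeline"))

    # Device code is generated for Hexagon, host code for AArch64 Android.
    return tvm.build(
        s, [A, B, C], target=tvm.target.hexagon(), target_host=HOST_TARGET
    )
```

The returned module plays the role of m in the text; its argument order (A, B, C) is the order in which the arrays would be passed when the module is called.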

At this point you can establish a connection:

tracker = tvm.rpc.connect_tracker(tracker_host, tracker_port)
remote = tracker.request(key, priority=0, session_timeout=60)

The tracker_host/tracker_port are the hostname and port number of the tracker. You can use 127.0.0.1 and 9190 here. The key parameter is the identifier you used with the tvm_rpc app.

Finally, upload all the files to the device, and then load them on the device (you may need to add full path names):

remote.upload('cpu.so')
remote.upload('hexagon.so')
remote.upload('hexagon.tvm_meta.json')

remote_cpu = remote.load_module('cpu.so')
remote_hexagon = remote.load_module('hexagon.so')
remote_cpu.import_module(remote_hexagon)

Here, remote_cpu is the callable remote module which you can invoke:

remote_cpu(a)

Remember that a has to be created in the remote Hexagon context, i.e. ctx = remote.hexagon(0).
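As a rough illustration of that last point, the call could look like the sketch below. It assumes the module was built for a matmul taking three n x n float32 arrays in the order (A, B, C), that remote and remote_cpu are the objects from the steps above, and that the TVM version still provides NDArray.asnumpy(); the function name run_remote_matmul and the tolerance are made up for this example.

```python
def run_remote_matmul(remote, remote_cpu, n=64):
    """Allocate inputs in the remote Hexagon context, run, and check."""
    # Imported inside the function so the sketch loads without TVM/NumPy.
    import numpy as np
    import tvm

    ctx = remote.hexagon(0)  # the remote Hexagon context

    a_np = np.random.uniform(size=(n, n)).astype("float32")
    b_np = np.random.uniform(size=(n, n)).astype("float32")

    # Every argument, including the output, lives in the remote context.
    a = tvm.nd.array(a_np, ctx)
    b = tvm.nd.array(b_np, ctx)
    c = tvm.nd.array(np.zeros((n, n), dtype="float32"), ctx)

    remote_cpu(a, b, c)

    # Copy the result back and compare against a NumPy reference.
    np.testing.assert_allclose(c.asnumpy(), a_np @ b_np, rtol=1e-4)
    return c
```

Comparing against a NumPy result on the PC is a cheap way to confirm that the offloaded code actually ran and produced the expected values.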

Feel free to ask more questions if something isn’t clear.

nikzb · March 11, 2021, 12:45am

Hi @kparzysz

Thank you very much for your detailed reply!

I am currently stuck somewhere trying to make this work. So for starters, I would like to verify further if the preceding steps I have taken so far are correct:

My understanding from reading the README.md of the Hexagon Backend Runtime is that, whether I use the simulator or an actual device, I need to build two runtimes.

1. Uploaded stub libraries to /vendor/bin/
2. Uploaded skel libraries to /vendor/lib/rfsa/adsp/
3. Built TVM Runtime for Android, which I pushed to /vendor/bin together with tvm_rpc
4. Built TVM Runtime for Hexagon (the one with disabled device support for Hexagon device), which I pushed to /vendor/lib/rfsa/adsp/

I was wondering whether uploading both runtimes from steps 3 and 4 (with the same filename, libtvm_runtime.so) would cause some form of conflict, or whether I got this part all wrong. I read the Hexagon docs, and they say that an executable looks for dynamic libraries in several locations in succession (the same paths I pushed the files to).

Then, after doing steps 1 to 4, setting up rpc tracker and server, and finally testing matmul, I am getting this error:

root@tvm:/workspace/tvm/apps/android_rpc/tests# python3 matmul.py
Traceback (most recent call last):
File "matmul.py", line 39, in <module>
f.save('cpu.so')
File "/workspace/tvm/python/tvm/runtime/module.py", line 166, in save
_ffi_api.ModuleSaveToFile(self, file_name, fmt)
File "tvm/_ffi/_cython/./packed_func.pxi", line 322, in tvm._ffi._cy3.core.PackedFuncBase.__call__
File "tvm/_ffi/_cython/./packed_func.pxi", line 257, in tvm._ffi._cy3.core.FuncCall
File "tvm/_ffi/_cython/./packed_func.pxi", line 246, in tvm._ffi._cy3.core.FuncCall3
File "tvm/_ffi/_cython/./base.pxi", line 160, in tvm._ffi._cy3.core.CALL
tvm._ffi.base.TVMError: Traceback (most recent call last):
[bt] (3) /workspace/tvm/build/libtvm.so(TVMFuncCall+0x61) [0x7fe8c456fd61]
[bt] (2) /workspace/tvm/build/libtvm.so(+0x1442f83) [0x7fe8c453ff83]
[bt] (1) /workspace/tvm/build/libtvm.so(tvm::codegen::LLVMModuleNode::SaveToFile(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x6d6) [0x7fe8c4518a06]
[bt] (0) /workspace/tvm/build/libtvm.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x67) [0x7fe8c36e24b7]
File "/workspace/tvm/src/target/llvm/llvm_module.cc", line 151
TVMError: Do not know how to save file cpu.so with format=''

This error tells me that the runtime on my PC side is using llvm_module.cc instead of hexagon_module.cc, which likely explains its inability to save the module properly.

I cannot pinpoint exactly what is causing this. I think the error above is related to how I set up the runtime appropriately. Any hints?

Could this, by any chance, be related to the full TVM build on my PC? Also, if it helps, I built the TVM package with set(USE_HEXAGON_DEVICE sim) as a temporary measure, because if I use set(USE_HEXAGON_DEVICE target) instead, I get the exact same error here whenever I simply call import tvm.

Thank you so much again in advance!