Enabling µTVM on different device

Sorry for the late reply. I would like to add support for a couple more boards before opening a pull request, to enable a wider range of test devices. Currently I am looking into Cypress PSoC boards, which might be interesting, as they have both an M0+ and an M4 on the board.

However, while porting things over to the M0+ for now (the M4 needs some additional work, due to boot code on the M0+), I kept running into the problem that I can’t compile the example anymore.

I only added the new target board in the same places as the L496ZG before, but the build fails with:

make[3]: *** [CMakeFiles/module.dir/<path>/micro-blogpost//workspace/builds/2021-02-23T09-00-55/src/module/lib1.c.obj] Error 1
make[2]: *** [CMakeFiles/module.dir/all] Error 2
make[1]: *** [CMakeFiles/module.dir/rule] Error 2

There seems to be an error in lib1.c (according to the build output):

<path>/micro-blogpost/workspace/builds/2021-02-23T09-00-55/src/module/lib1.c:21:7: error: implicit declaration of function 'read_and_pad' [-Werror=implicit-function-declaration]
   21 |       read_and_pad(&aa[i*A_stride + j*4], (int32_t*) &aa_pad[i*4 + j*4], (int32_t*) &aa_pad[i*4 + j*4 + 2]);
      |       ^~~~~~~~~~~~

What confuses me is that the same command with the same files worked flawlessly with other boards.

Another problem I just realized is that some boards need to be switched from flashing to UART mode by pressing a button on the device itself. How can I account for that in TVM? And would that introduce problems for AutoTVM?

hi @max1996,

read_and_pad comes from the CMSIS-NN tree; you may need to include this in your build.

Another problem I just realized is that some boards need to be switched from flashing to UART mode by pressing a button on the device itself. How can I account for that in TVM? And would that introduce problems for AutoTVM?

Could you give an example board? Perhaps there’s a programmatic way to do this?

Andrew

OK, thank you. As I just reused most of your project and added the new targets, it seems like Zephyr is missing the CMSIS-NN module for the Cypress boards? I am not really experienced with Zephyr, so I might have made a mistake here…

So far, I’ve found two boards that need the button press: the Cypress PSoC 6 BLE Pioneer Board (CY8CKIT-062-BLE) and the Cypress PSoC 6 WiFi-BT Pioneer Board (CY8CKIT-062-WiFi-BT).

I wanted to use them because they are heterogeneous systems, made up of a Cortex-M0+ and a Cortex-M4, which might be interesting for µTVM.

This problem seems to be related to the KitProg2 (PSoC 5LP) programmer and debugger (CY8C5868LTI-LP039, U2). According to the documentation of both the Zephyr project and Cypress, you have to push the button SW3 on the board, as it switches between the mass storage mode for flashing and the serial mode for debugging. The two modes come with different USB IDs, but adding logic to decide which mode is currently active might be a strange solution, as it would require changes to TVM for just two boards.

My suspicion is that this could apply to more boards and programmers out in the wild.
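
The mode detection mentioned above could be sketched roughly as follows. This is only an illustration: the vendor and product IDs are made-up placeholders (not the real KitProg2 values), and `detect_mode` is a hypothetical helper, not something that exists in TVM. In practice the list of attached IDs would come from a USB enumeration library.

```python
from enum import Enum


class ProbeMode(Enum):
    MASS_STORAGE = "mass_storage"  # mode used for flashing
    SERIAL = "serial"              # mode used for UART / debugging
    UNKNOWN = "unknown"


# Hypothetical placeholder IDs. Replace them with the (vendor ID, product ID)
# pairs that the programmer actually reports in each mode.
MODE_BY_USB_ID = {
    (0x1234, 0x0001): ProbeMode.MASS_STORAGE,
    (0x1234, 0x0002): ProbeMode.SERIAL,
}


def detect_mode(attached_ids):
    """Return the mode of the first recognized (vid, pid) pair, else UNKNOWN."""
    for usb_id in attached_ids:
        mode = MODE_BY_USB_ID.get(tuple(usb_id))
        if mode is not None:
            return mode
    return ProbeMode.UNKNOWN


if __name__ == "__main__":
    # Example: an unrelated device plus the programmer in serial mode.
    print(detect_mode([(0xABCD, 0x0042), (0x1234, 0x0002)]))
```

Keeping the ID table as plain data would at least confine the board-specific part to one place, which may soften the concern about changing TVM for just two boards.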

Hi @max1996,

Did you get a warning about missing the arm_nnsupportfunctions.h header? If not, maybe there is a different problem. In any case, it may make sense for us to inline those functions to remove the dependency, license permitting. That’s another option you could try (needs to be placed in gemm.py). If you do that, it’d be great if you could open a PR.
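
As a rough illustration of the inlining idea, assuming gemm.py builds its C source as a Python string: the generator can prepend the helper definitions itself instead of emitting an #include. The names `emit_kernel_source`, `example_helper`, and `EXAMPLE_INLINED_HELPERS` are hypothetical, and the helper body is a placeholder; the real body would have to come from CMSIS-NN, license permitting.

```python
import textwrap

# Placeholder only: this does NOT reproduce any CMSIS-NN implementation.
# The include-style guard lets several generated kernels share one
# translation unit without redefining the helper.
HELPER_SOURCE = textwrap.dedent("""\
    #ifndef EXAMPLE_INLINED_HELPERS
    #define EXAMPLE_INLINED_HELPERS
    static inline void example_helper(void) {
      /* helper body would go here */
    }
    #endif
""")


def emit_kernel_source(kernel_body, inline_helpers=True):
    """Return C source for a kernel, with helpers inlined or included."""
    if inline_helpers:
        prelude = HELPER_SOURCE
    else:
        prelude = "#include <arm_nnsupportfunctions.h>\n"
    return prelude + "\n" + kernel_body


if __name__ == "__main__":
    print(emit_kernel_source("void kernel(void) { example_helper(); }\n"))
```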

Hm, I see. This can sometimes happen. Usually there is a programmatic way to switch modes, but it might be an undocumented USB endpoint or not known to us. Based on the Zephyr doc, I see the USB programmer µC can switch between SW-DP mode and UART mode based on SW3. Here are some possible options to enable AutoTVM on this part:

  1. You could use semihosting in place of UART for the RPC transport. This would require firmware changes on the device (to call semihosting functions in place of UART) and implementing a new Transport subclass. I think OpenOCD supports semihosting, so you should be able to do this with Zephyr. I think this would mean you wouldn’t need to switch modes?
  2. We’ve contemplated a performance improvement for AutoTVM which would involve loading the task code into RAM over UART and doing a “link” onboard the device (patching a pointer to a function table). This is much more involved, but has the benefit that AutoTVM would speed up considerably and not wear out the flash. We will probably tackle this later in the year, but you could try this now if you like.
  3. You could add code to AutoTVM to tell you to push the switch when appropriate.
  4. You may see acceptable performance autotuning on another board with a similar architecture, then programming the Cypress part with that tuning schedule.

I think (1) is the easiest fully-automated thing to implement here for a proper solution.
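
For comparison, option (3) could look roughly like the sketch below. `wait_for_mode` and its parameters are hypothetical and not part of AutoTVM; `current_mode` stands for any callable that reports which mode the programmer is currently in.

```python
import time


def wait_for_mode(wanted, current_mode, notify=print, timeout_s=60.0,
                  poll_s=0.5, clock=time.monotonic, sleep=time.sleep):
    """Ask the user to press the switch, then poll until the mode matches.

    Returns True once current_mode() equals `wanted`, or False on timeout.
    """
    if current_mode() == wanted:
        return True  # already in the right mode, no prompt needed
    notify(f"Please press the mode switch on the board to enter {wanted!r} mode.")
    deadline = clock() + timeout_s
    while clock() < deadline:
        if current_mode() == wanted:
            return True
        sleep(poll_s)
    return False


if __name__ == "__main__":
    # Simulated board: reports "flash" twice, then "uart" after the button press.
    modes = iter(["flash", "flash", "uart"])
    print(wait_for_mode("uart", lambda: next(modes), poll_s=0.0))
```

This keeps a human in the loop for every flash/run cycle, so it would probably only be practical for short tuning runs.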

Andrew

Hi @areusch, no, there is no warning about missing headers, and according to the log, the include directories seem to be passed to the compiler:

opts ['tvm/include', 'tvm/3rdparty/dlpack/include', 'tvm/3rdparty/libcrc/include', 'tvm/3rdparty/dmlc-core/include', 'tvm/src/runtime/crt/include', 'micro-original/runtimes/zephyr/crt', 'micro-original/3rdparty/CMSIS_5/CMSIS/DSP/Include', 'micro-original/3rdparty/CMSIS_5/CMSIS/NN/Include', 'micro-original/3rdparty/STM32CubeF7/Drivers/CMSIS', '/home/vagrant/zephyr/modules/hal/cmsis/CMSIS/Core/Include']

But the build folders in the workspace directory do not contain anything related to CMSIS-NN. DSP has its own subfolder, but NN seems to be missing.

@max1996 I think you should have a file micro-original/3rdparty/CMSIS_5/CMSIS/NN/Include/arm_nnsupportfunctions.h. Do you see read_and_pad defined in that file?

Yes, it is all there. I also checked the flags in the project config and the CONFIG_CMSIS_DSP flag has been set to enabled.

EDIT: Apparently, the flag CMSIS_DSP, which guards these functions, had not been set. I enabled it manually and that solved the issue. I am unsure whether this is a lack of support for PSoC boards in Zephyr or a mistake on my side (probably the latter).
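
A quick way to find which preprocessor conditions surround a symbol like this is to scan the header for the `#if` blocks that are open where the symbol first appears. The sketch below is a simplification (it ignores `#else` and `#elif`), and the sample header and the name `EXAMPLE_DSP_FLAG` are made up for illustration; they do not reproduce the real CMSIS-NN header.

```python
import re

OPEN_RE = re.compile(r"#\s*(if|ifdef|ifndef)\b")
CLOSE_RE = re.compile(r"#\s*endif\b")


def enclosing_guards(header_text, symbol):
    """Return the conditions open at the first non-directive line naming `symbol`.

    Returns None if the symbol does not appear in the header at all.
    """
    stack = []
    for line in header_text.splitlines():
        stripped = line.strip()
        if OPEN_RE.match(stripped):
            stack.append(stripped)
        elif CLOSE_RE.match(stripped):
            if stack:
                stack.pop()
        elif symbol in stripped and not stripped.startswith("#"):
            return list(stack)
    return None


# Made-up header used only to demonstrate the scan.
SAMPLE_HEADER = """\
#ifndef EXAMPLE_SUPPORT_H
#define EXAMPLE_SUPPORT_H
#if defined(EXAMPLE_DSP_FLAG)
static inline void read_and_pad(void) {}
#endif
#endif
"""

if __name__ == "__main__":
    for guard in enclosing_guards(SAMPLE_HEADER, "read_and_pad"):
        print(guard)
```

If the scan lists a condition that the build never defines, the compiler never sees the declaration, which would match the implicit-declaration error above.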

However, it still fails at the compilation stage, as it treats warnings as errors and warns about a couple of lines in the runtimes/zephyr/main.c file, which has not been changed from its original state for the F746ZG. I have not found a simple way to disable -Werror in the current flow.

Sorry @areusch, I forgot to tag you in my response.

@max1996 glad you were able to work around the CMSIS_DSP problem. Looking at the Zephyr docs, I see CONFIG_CMSIS_DSP is n by default. Is this the flag you mention, or are you talking about a #define?

Could you give some more specifics about the warnings you’re seeing?

Hi @areusch, the CONFIG_CMSIS_DSP flag was set in the proj.conf inside the runtimes/zephyr directory. However, as this did not have any effect, I added the #define to the header file itself (just as a hacky workaround for now).

The other errors seem to be a result of my switch to a newer Zephyr version (which I needed for PSoC board support). I missed the errors among the warnings at first, but it looks like Zephyr replaced memory pools with memory slabs between the two versions. I am currently trying to adapt µTVM to use slabs instead of pools.

Zephyr Project - Latest - Memory Slab

Zephyr Project - 1.12 - Memory Pools

I have never worked with something like Zephyr, so I just tried to replace the memory-pool-specific functions with the memory slab variants. However, that might not be enough, as slabs seem to be much less flexible than pools.
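
To illustrate why fixed-size slabs are less flexible, here is a small stand-alone simulation. It is not Zephyr code, and `FixedBlockSlab` is a made-up class: every allocation consumes one whole block, so a request larger than the block size can never be served, and small requests waste the remainder of their block. (As far as I understand, the real Zephyr slab call does not even take a size argument; the size here only makes the waste visible.)

```python
class FixedBlockSlab:
    """Toy model of a slab allocator: num_blocks blocks of block_size bytes."""

    def __init__(self, block_size, num_blocks):
        self.block_size = block_size
        self.free_blocks = list(range(num_blocks))
        self.used = {}  # block index -> number of bytes actually requested

    def alloc(self, nbytes):
        """Return a block index, or None if the request cannot be served."""
        if nbytes > self.block_size or not self.free_blocks:
            return None
        block = self.free_blocks.pop()
        self.used[block] = nbytes
        return block

    def free(self, block):
        """Return a previously allocated block to the free list."""
        del self.used[block]
        self.free_blocks.append(block)

    def wasted_bytes(self):
        """Bytes reserved by allocated blocks but not requested by callers."""
        return sum(self.block_size - n for n in self.used.values())


if __name__ == "__main__":
    slab = FixedBlockSlab(block_size=256, num_blocks=4)
    print(slab.alloc(1024))     # too large for any block
    slab.alloc(16)              # fits, but occupies a full 256-byte block
    print(slab.wasted_bytes())
```

A runtime that allocates tensors of widely varying sizes would hit both limits, which is why a variable-size allocator tends to be the more natural replacement for pools.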

EDIT: The memory slabs seem to be working (at least they compile). However, there is now an undefined reference to “z_impl_sys_rand32_get” in microtvm-blogpost-eval/workspace/builds/<individual build>/build/runtime/zephyr/include/generated/syscalls/rand32.h:33

I’ve found the function in GitHub - Zephyr Project - rand32_entropy_device.c, which should be enabled by setting CONFIG_ENTROPY_GENERATOR=y in the project config, but this does not seem to work correctly either…
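
When an option set in the project config appears to have no effect, it can help to check the merged configuration that the Zephyr build writes to `zephyr/.config` inside the build directory, since Kconfig silently drops options whose dependencies are not met. The helper below is a minimal sketch; `kconfig_value` is a hypothetical name and the sample text is made up, not taken from an actual build.

```python
def kconfig_value(dotconfig_text, option):
    """Look up `option` in the text of a Kconfig-generated .config file.

    Returns the value string (for example "y"), "n" if the option is listed
    as "is not set", or None if the option does not appear at all.
    """
    for line in dotconfig_text.splitlines():
        line = line.strip()
        if line == f"# {option} is not set":
            return "n"
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
    return None


# Made-up sample, for illustration only.
SAMPLE_DOTCONFIG = """\
CONFIG_MAIN_STACK_SIZE=2048
# CONFIG_ENTROPY_GENERATOR is not set
CONFIG_CMSIS_DSP=y
"""

if __name__ == "__main__":
    for name in ("CONFIG_ENTROPY_GENERATOR", "CONFIG_CMSIS_DSP", "CONFIG_FOO"):
        print(name, kconfig_value(SAMPLE_DOTCONFIG, name))
```

A result of "n" or None for an option that was requested as `=y` suggests that the board or another option does not satisfy one of its dependencies.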

hi @max1996,

@dream-math also did an implementation with memory heaps here. I think the heap implementation might be a better fit since memory is not allocated in fixed sizes.

I believe you need to enable an RNG API on some boards. I used this one on boards without a HW RNG.

We are consolidating the various runtimes into a single Zephyr runtime generator repository, but it will be some time before that refactoring is complete.

-Andrew