As I do not have one of the supported boards on hand, how difficult is it to port the current state of µTVM to a different board? I happen to have a couple of STM32L496 Nucleo-144 boards.
These are Cortex-M4 based, instead of the Cortex-M7.
What do I have to do to get the microtvm-eval blog repository to work?
Thank you, that helped a lot, as I had missed some of these locations.
I tried running the blogpost demo, but it is unable to flash the firmware to the device: it times out after 10 seconds. That is most likely not a TVM problem, though. (I am not an expert on these development boards.)
hi @max1996, can you share a pointer to your code and I may be able to help? unfortunately we need to write up some documentation to describe how to port to a new board.
hi @areusch,
I will create a public fork to share the current state, but the changes are small.
The main differences are the added entries for the platform in some dictionaries.
The full error at the flashing and running step looks like this:
I was able to find the terminal output of the flashing process
Open On-Chip Debugger 0.10.0+dev-01341-g580d06d9d-dirty (2020-05-16-15:41)
Licensed under GNU GPL v2
For bug reports, read
http://openocd.org/doc/doxygen/bugs.html
0665FF544851717867230824
Info : The selected transport took over low-level target control. The results might differ compared to plain JTAG/SWD
Info : clock speed 500 kHz
Info : STLINK V2J30M20 (API v2) VID:PID 0483:374B
Info : Target voltage: 3.261782
Info : stm32l4x.cpu: hardware has 6 breakpoints, 4 watchpoints
Info : Listening on port 3333 for gdb connections
TargetName Type Endian TapName State
-- ------------------ ---------- ------ ------------------ ------------
0* stm32l4x.cpu hla_target little stm32l4x.cpu running
Info : Unable to match requested speed 500 kHz, using 480 kHz
Info : Unable to match requested speed 500 kHz, using 480 kHz
target halted due to debug-request, current mode: Thread
xPSR: 0x01000000 pc: 0x08001df0 msp: 0x2003e368
Info : device idcode = 0x20006461 (STM32L49/L4Axx - Rev: B)
Info : flash size = 1024kbytes
Info : flash mode : dual-bank
Warn : Adding extra erase range, 0x08021768 .. 0x080217ff
auto erase enabled
wrote 137064 bytes from file /tmp/tvm-debug-mode-tempdirs/2021-01-28T09-48-51___7oxhvrt4/00000/build/runtime/zephyr/zephyr.hex in 8.824641s (15.168 KiB/s)
Info : Unable to match requested speed 500 kHz, using 480 kHz
Info : Unable to match requested speed 500 kHz, using 480 kHz
target halted due to debug-request, current mode: Thread
xPSR: 0x01000000 pc: 0x08001df0 msp: 0x2003e368
verified 137064 bytes in 4.516465s (29.636 KiB/s)
Info : Unable to match requested speed 500 kHz, using 480 kHz
Info : Unable to match requested speed 500 kHz, using 480 kHz
shutdown command invoked
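As a quick sanity check on the log above, the write and verify phases can be timed from its two summary lines. The sketch below is plain arithmetic on the OpenOCD output. The regular expression and the names PATTERN and phase_times are my own, and whether TVM's 10-second timeout covers the flashing step at all is an assumption that the log does not confirm.

```python
import re

# The two summary lines copied from the OpenOCD output above.
LOG = """\
wrote 137064 bytes from file /tmp/tvm-debug-mode-tempdirs/2021-01-28T09-48-51___7oxhvrt4/00000/build/runtime/zephyr/zephyr.hex in 8.824641s (15.168 KiB/s)
verified 137064 bytes in 4.516465s (29.636 KiB/s)
"""

# Matches "<phase> <size> bytes ... in <seconds>s" at the start of a line.
PATTERN = re.compile(r"^(wrote|verified) (\d+) bytes\b.*?\bin ([\d.]+)s", re.MULTILINE)


def phase_times(log):
    """Return {phase: (size_in_bytes, seconds)} for each write/verify line."""
    return {
        m.group(1): (int(m.group(2)), float(m.group(3)))
        for m in PATTERN.finditer(log)
    }


times = phase_times(LOG)
total_seconds = sum(seconds for _, seconds in times.values())

for phase, (size, seconds) in times.items():
    print(f"{phase}: {size / 1024 / seconds:.3f} KiB/s over {seconds:.3f}s")
print(f"write + verify: {total_seconds:.3f}s")
```

Write and verify together take about 13.3 s. If the timeout does include flashing, the 480 kHz debug clock could be relevant, but that is a guess and not a diagnosis.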
hi @max1996, thanks for posting that up. your changes look fine to me. at this point i’d guess there is a problem with either:
1. the RPC server is not starting properly (you should see uTVM On-Device Runtime written on the UART immediately after startup)
2. TVM is trying to use the wrong serial port to communicate
3. the session setup logic is causing the board to crash. This would typically only happen if there were problems allocating memory on the board, but some memory is allocated during startup so it’s unlikely.
I think we have debugged cause 3 fairly well, so I would see if you can investigate cause 1 or 2. here are some pointers:
you should find the generated Zephyr project under workspace/builds/<datetime>. try flashing the project and then use python -mserial.tools.miniterm <port> 115200 to verify you see that debug output. if not, use west debug to launch GDB and investigate. Most likely, you need to adjust the size of your memory pool or the UART being used for zephyr_console output
if you do see that traffic on the serial port, verify TVM is using the correct port. see python/tvm/micro/transport/serial.py and python/tvm/micro/contrib/zephyr.py.
you can also use --debug-micro-execution to debug the firmware binary while TVM is sending it commands. You need to launch a separate terminal window and run python -mtvm.exec.microtvm_debug_shell, and then TVM will launch GDB in that terminal after it has flashed the device
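The banner check from the first pointer can also be scripted instead of watched in miniterm. This is only a sketch: saw_banner and check_port are hypothetical helper names, the banner text is taken from the post above, and check_port assumes pyserial is installed and that the port path is supplied by you.

```python
import io

BANNER = b"uTVM On-Device Runtime"


def saw_banner(stream, banner=BANNER, max_reads=50, chunk_size=256):
    """Return True if `banner` shows up in the bytes read from `stream`.

    `stream` is any object with a read(n) method returning bytes, for
    example an open pyserial port or an io.BytesIO used for testing.
    """
    buf = b""
    for _ in range(max_reads):
        buf += stream.read(chunk_size)
        if banner in buf:
            return True
        # Keep only a tail long enough to catch a banner split across reads.
        buf = buf[-(len(banner) - 1):]
    return False


def check_port(port):
    """Open `port` at 115200 baud and look for the banner (needs pyserial)."""
    import serial  # imported lazily so the rest works without pyserial

    with serial.Serial(port, 115200, timeout=1) as ser:
        return saw_banner(ser)


# Offline demonstration with canned bytes instead of a real board.
print(saw_banner(io.BytesIO(b"\x00boot\r\nuTVM On-Device Runtime\r\n")))
```

Reset the board after opening the port, since the message is only written immediately after startup.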
thank you for the initial pointers. I am not very experienced with such embedded devices, but I think I managed to follow your points:
I guess the workspace/builds path only applies when I am not using the Jupyter Notebook to run the example, so I took the files from the path that is returned in one of its cells
west flash returns the same error as when it is executed from Python
west debug seems to indicate that only a reset has been executed and that the board is idling afterwards
the flash output (for the non-standalone projects) looks correct to me. the next step in debugging is to open a serial console with python -m serial.tools.miniterm and verify whether or not you see the debug uTVM On-Device Runtime message printed. you need to discover the serial device assigned to your Nucleo board’s Virtual COM Port. this is typically of the form /dev/ttyACMn, /dev/ttyUSBn, or sometimes /dev/tty.usb*. The easiest way to find it is to look for these in /dev while the board is connected, then unplug the board and see what disappears.
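The plug/unplug comparison can be sketched in a few lines. The glob patterns are the device name forms mentioned above; list_candidates and disappeared are hypothetical helper names, not part of TVM or pyserial.

```python
import glob

# Device name patterns mentioned in the post above.
PATTERNS = ("/dev/ttyACM*", "/dev/ttyUSB*", "/dev/tty.usb*")


def list_candidates(patterns=PATTERNS):
    """Return the set of serial device paths currently present."""
    found = set()
    for pattern in patterns:
        found.update(glob.glob(pattern))
    return found


def disappeared(connected, unplugged):
    """Return the paths seen while connected but gone after unplugging."""
    return sorted(set(connected) - set(unplugged))


# Illustration with made-up snapshots; on real hardware, call
# list_candidates() once with the board attached and once without it.
before = {"/dev/ttyACM0", "/dev/ttyUSB0"}
after = {"/dev/ttyUSB0"}
print(disappeared(before, after))
```

If more than one path disappears, the board probably exposes several interfaces and each one needs to be tried.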
Once you know that, run python -mserial.tools.miniterm <path to serial device> 115200 and you should be able to confirm whether you see the debug message. if you do see the message, it’s likely TVM is just using the wrong serial device internally (it auto-detects the device to use, but this logic is somewhat specific to each board family). you can probably fix it by overriding the chosen device in Python. otherwise, there is likely an error in startup between Zephyr and the µTVM Runtime. this error may unfortunately be tricky to debug over the forum.
The standalone unfortunately requires 512KB RAM so it’s unlikely it will work with your particular dev board. You could eventually try supplying a smaller network, though.
thank you very much for your help.
It seems like you are right.
I looked at the serial output while trying to run the example jupyter notebook and it looks like the board is stuck in a loop due to TVM using the wrong device for communication.
Okay, I finally got my hands on a F746ZG board and experienced the same problem.
I added some print output for the port path and it is using the correct device (/dev/ttyACM0)
The error seems to alternate between timeout during the handshake and “Check failed: bytes_consumed <= pending_chunk_.size() == false: consumed 18446744073709551605 want <= 149”
Does that indicate some kind of race condition?
I tried to insert some prints, to get more information on where it fails, but I could not find out anything useful
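One observation about the number in the check failure: 18446744073709551605 is 2^64 - 11, which is what -11 looks like when it is stored in an unsigned 64-bit integer. The snippet below only reinterprets the value; as_signed_64 is a helper name of my own.

```python
def as_signed_64(value):
    """Reinterpret an unsigned 64-bit value as a two's-complement signed one."""
    value &= (1 << 64) - 1
    return value - (1 << 64) if value >= (1 << 63) else value


consumed = 18446744073709551605  # value from the "Check failed" message
print(as_signed_64(consumed))  # -11
```

That points toward a negative value, or an unsigned subtraction that underflowed, ending up in the byte count. It does not by itself confirm or rule out a race condition.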
I think perhaps the runtime in the blogpost did not get TVMPlatformGenerateRandom yet, so it’s using the default weak-linked impl that doesn’t do anything (but which is useful when doing standalone deployment).
Hi @areusch ,
sorry to bother you again, but the linked PR did not work out with the F746ZG.
I realized that the system time of my VM was fluctuating a lot and after fixing it, I was able to get debug information during the execution process (it outputs a lot) and during the handshake the problem really seems to be related to the session id / random generator
here is the seemingly important part (this part repeats until it times out):
I tested it with and without the pull request, as well as with my modified version (using the Nucleo L496ZG) and the original version (using the Nucleo F746ZG). The error stays the same.
I did not find “CONFIG_TIMER_RANDOM_GENERATOR=y” in any project file inside the blogpost repository, but it is present in the prj.conf in tvm (/home/max/github/tvm/tests/micro/qemu/zephyr-runtime/prj.conf). Is this correct? Where do I have to add this line, as the prj.conf file in the blogpost repository seems to be created for each run of the build process?
ah yeah, it depends per-board. try adding it to the template prj.conf (you will need to comment out line 46). that prj.conf will be copied into each created project tree under workspace/
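To check which symbols a given prj.conf actually sets, for the template as well as for a copy generated under workspace/, a small parser is enough. This is a sketch: parse_prj_conf is a hypothetical helper, and SAMPLE is made-up content for illustration, not the real template.

```python
def parse_prj_conf(text):
    """Return {symbol: value} for every active CONFIG_ line in a prj.conf."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and lines that are commented out.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.startswith("CONFIG_"):
            settings[key.strip()] = value.strip()
    return settings


# Hypothetical file contents, for illustration only.
SAMPLE = """\
CONFIG_TIMER_RANDOM_GENERATOR=y
# CONFIG_RESET_ON_FATAL_ERROR=n
"""

settings = parse_prj_conf(SAMPLE)
print(settings.get("CONFIG_TIMER_RANDOM_GENERATOR"))
print("CONFIG_RESET_ON_FATAL_ERROR" in settings)
```

Running it on both the template and a generated copy shows whether an edit to the template reached the build.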
warning: The choice symbol TIMER_RANDOM_GENERATOR (defined at subsys/random/Kconfig:29) was selected
(set =y), but no symbol ended up as the choice selection. See
http://docs.zephyrproject.org/latest/reference/kconfig/CONFIG_TIMER_RANDOM_GENERATOR.html and/or
look up TIMER_RANDOM_GENERATOR in the menuconfig/guiconfig interface. The Application Development
Primer, Setting Configuration Values, and Kconfig - Tips and Best Practices sections of the manual
might be helpful too.
and if I do not pass the --log-level=DEBUG argument, it will get stuck at the same point, but without reaching the timeout.
If I pass the --log-level argument, it will time out while trying to connect, just as before.
As I am now using the same board that you used in the demo and it is still not working, I will try to erase everything and start with a new VM. I must have done something wrong.
EDIT:
I recloned the TVM and blogpost repositories, used the same board as you did and it worked after I removed
CONFIG_RESET_ON_FATAL_ERROR=n
from the prj.conf (it would not compile otherwise). It is now working correctly (I did not try with the pull request again yet), so I can now proceed to re-add my changes for the L496ZG and test them.
Thank you very much.
EDIT EDIT:
I was able to modify a fork of your repository as well as of TVM to support the Nucleo-L496ZG board.
would it be useful to merge these changes into TVM to enable support for “smaller” CPUs (the fork of the blogpost would be less relevant)?
ah interesting, okay. i’m not sure what may have been wrong before, but i’m glad it’s working now.
i took a look at your changes and they look fine overall. we may need to think a little about templating the prj.conf and generating it per-board (actually I think the one checked-in at master right now works with nRF boards and not STM, so I should fix that).
do you want to open some PRs and we can iterate there?