We have a large system composed of a mesh of processing elements, each with an ARM processor and supporting ML accelerators. We already have a software stack that handles graph partitioning and mapping for other, unrelated tasks, and we would like it to support TVM. However, although the system is powerful as a whole, each processing element has the same local limitations as the bare-metal devices covered by microTVM.
I would like to ask for your feedback on where you think this system would fit. I would highly appreciate any recommendations regarding the inclusion of such a backend.
Thanks for your time.