Inference Hangs with No Output on Ethos-N When Running TVM Compiled Model

Hi,

I’m working on optimizing a custom YOLOv11 model for inference on an embedded board that features an Arm Ethos-N78 NPU.

I’ve successfully compiled the model using TVM/tvmc. The compilation targets were set for heterogeneous execution: --target "ethos-n ..." for the NPU and --target "llvm ..." for the CPU. The compilation process itself completes without any errors.
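For reference, my compile step looks roughly like the following sketch. The model path, output name, and the `-variant` option are placeholders for illustration, not my exact invocation:

```shell
# Rough sketch of the compile step (file names and variant option are placeholders).
# tvmc accepts a comma-separated target string: "ethos-n" covers the partitions
# offloaded to the NPU, and "llvm" covers everything that stays on the CPU.
tvmc compile \
  --target "ethos-n -variant=n78, llvm" \
  --output model.tar \
  yolov11.onnx
```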

The Problem

However, when I try to run inference with the compiled module (the .tar file) on the target board, the application simply hangs. It produces no output and no error messages, effectively getting stuck.
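For context, the run step I use looks roughly like this (the input/output file names are placeholders, and the flags shown are the standard tvmc ones as I understand them). This is the point where the process hangs, with no output tensors, no error, and no exit:

```shell
# Rough sketch of the run step (input/output file names are placeholders).
# The process hangs here: no predictions written, no error message printed.
tvmc run \
  --inputs input.npz \
  --outputs predictions.npz \
  --print-time \
  model.tar
```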

My Questions

  1. What are the most common reasons for this kind of behavior (inference hanging with no output) in a TVM + Ethos-N setup?
  2. Are there any recommended methods to debug or check where the model gets stuck during the inference process? For instance, how can I verify which parts of the model were correctly offloaded and where the execution is halting?

Any help or guidance on how to troubleshoot this would be greatly appreciated.

My models are available at https://github.com/koreaygj/ethos-vision-optimizer/tree/main/models/optimized_npu

and my inference results look like this:

Thanks,