Very high CPU utilization of TVM inference vs. TensorFlow Lite

I'm not sure I understand the problem correctly, but if you are using “top” to measure utilization, it might not be a good metric. You can have 100% utilization and still not be using your vector units efficiently.
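One way to get a number that is easier to reason about than “top” is to compare CPU time against wall-clock time around a single call. Below is a minimal sketch using only the Python standard library. `measure` and `busy_work` are hypothetical names, and `busy_work` is just a stand-in for one inference call, not anything from TVM or TensorFlow Lite.

```python
import time


def measure(fn, *args, **kwargs):
    """Run fn once and return (wall_seconds, cpu_seconds, cpu_per_wall)."""
    wall_start = time.perf_counter()
    # process_time() counts user + system CPU time of the whole process,
    # summed over all of its threads, and excludes time spent sleeping.
    cpu_start = time.process_time()
    fn(*args, **kwargs)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    ratio = cpu / wall if wall > 0 else 0.0
    return wall, cpu, ratio


def busy_work(n=200_000):
    # Hypothetical stand-in for a single inference call.
    total = 0
    for i in range(n):
        total += i * i
    return total


if __name__ == "__main__":
    wall, cpu, ratio = measure(busy_work)
    print(f"wall={wall:.4f}s cpu={cpu:.4f}s cpu/wall={ratio:.2f}")
```

A ratio near 1 suggests roughly one core was busy during the call, and a ratio well above 1 suggests several threads were running in parallel. Neither says anything about how well the vector units are used, so the wall time per inference is still the number to compare between the two runtimes.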

What do you use to measure utilization? If it is “top”, the high number should mean that different threads are sharing the same CPU.