Referring to https://tvm.apache.org/docs/dev/debugger.html, I am trying to profile my converted models, but the results are not what I expected.
- Running TVM without the debugger uses multiple threads: I can see load on most of my cores.
- However, when I run TVM with the debugger, I only see load on one core.
If that's the case, then the profiling data from the debugger does not reflect how the actual workload runs. So how can I profile TVM workloads the way we use the timeline in TensorFlow?
Roughly checking the code, it seems the debugger runs each op individually, one by one: https://github.com/apache/incubator-tvm/blob/d9f009a560fbec1f1f2394fdbbafbe7d43a92768/python/tvm/contrib/debugger/debug_runtime.py#L176-L190.
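For end-to-end numbers that keep the normal multi-threaded execution, one option is to skip the debug runtime and time the whole `run` call with `time_evaluator` on a regular graph runtime module. A minimal sketch, assuming `graph`, `lib`, and `params` come from `relay.build()` and an fp32 input (the function name and arguments here are illustrative, not from the docs above):

```python
def profile_whole_graph(graph, lib, params, input_name, input_shape,
                        number=10, repeat=3):
    """Time full graph executions so TVM's thread pool is used as in deployment.

    Returns mean seconds per end-to-end run. Unlike the debug runtime, this
    does not break execution into per-op calls, so multi-core behavior is
    the same as a normal inference run.
    """
    import numpy as np
    import tvm
    from tvm.contrib import graph_runtime

    ctx = tvm.cpu()
    m = graph_runtime.create(graph, lib, ctx)
    m.set_input(**params)
    m.set_input(input_name,
                np.random.uniform(size=input_shape).astype("float32"))
    # time_evaluator wraps the module's "run" function and reports statistics
    # over `number * repeat` executions of the entire graph.
    ftimer = m.module.time_evaluator("run", ctx, number=number, repeat=repeat)
    return ftimer().mean
```

This gives only whole-model timing, not a per-op timeline, but it is a sanity check for whether the debugger's per-op totals add up to realistic end-to-end latency.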