Hi everyone,
I am trying to profile my model to find the execution time and memory transfer sizes (i.e. loads/stores) for each operator. I am using tvm.contrib.debugger, but it only gives the total execution time and does not show how much of that time is spent on compute versus memory loads/stores.
I have gone through the article "Getting Started With PAPI" (tvm 0.10.dev0 documentation), but couldn't find how to configure it through RPC.
Any help would be appreciated.
Thanks