In Python, graph_runtime.set_input always calls _get_input(key).copyfrom(...). Is there a way to pass a NumPy array and use its pointer directly without copying?
I am trying to analyze the overhead of calling into a TVM model. I am running the model on CPU, and with the debug runtime I can see that the model latency is between 3.5 ms and 4.5 ms. However, when I manually time my run_model function in Python, I see ~5-6 ms. My run_model function consists of graph_runtime's set_input, run, and get_output(...).asnumpy(). I am trying to reduce the overhead in those areas.
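To show where the extra 1-2 ms might be going, each stage can be timed on its own rather than timing run_model as a whole. This is only a minimal sketch: time_stages is a hypothetical helper, and the lambdas are stand-in workloads that would be replaced by the real set_input, run, and get_output calls.

```python
import time


def time_stages(stages, repeats=100):
    """Return the mean wall-clock time in milliseconds for each named stage.

    `stages` is a list of (name, zero-argument callable) pairs. The stages are
    executed in order on every repeat, mirroring a single run_model call.
    """
    totals = {name: 0.0 for name, _ in stages}
    for _ in range(repeats):
        for name, fn in stages:
            start = time.perf_counter()
            fn()
            totals[name] += time.perf_counter() - start
    return {name: 1000.0 * total / repeats for name, total in totals.items()}


# Stand-in workloads (assumptions, not TVM calls). In real use each lambda
# would wrap one step, e.g. ("set_input", lambda: module.set_input(key, data)),
# where module, key and data are placeholders for your own objects.
stages = [
    ("set_input", lambda: sum(range(1000))),
    ("run", lambda: sum(range(10000))),
    ("get_output", lambda: sum(range(1000))),
]

for name, ms in time_stages(stages).items():
    print(f"{name}: {ms:.4f} ms")
```

Averaging over many repeats matters here because a single call at this scale is easily dominated by timer resolution and scheduling noise.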
Thanks!