Graph_runtime.set_input zero-copy from Python

In Python, graph_runtime.set_input always calls _get_input(key).copyfrom(...). Is there a way to pass a numpy array and use its pointer directly without copying?

I am trying to analyze the overhead of calling into a TVM model. I am running the model on CPU, and the debug runtime reports a model latency between 3.5 ms and 4.5 ms. However, when I manually time my run_model function in Python, I see about 5-6 ms. My run_model function consists of graph_runtime's set_input, run, and get_output(...).asnumpy(). I am trying to reduce the overhead in those three calls.
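To see which of the three calls accounts for the extra 1-2 ms, each one can be timed separately. The following is only a minimal sketch: `time_stages` is a hypothetical helper (not part of TVM), and the stage callables are whatever you pass in.

```python
import time


def time_stages(stages, repeats=100):
    """Return the mean wall-clock seconds per call for each named stage.

    `stages` is a list of (name, zero-argument callable) pairs; the stages
    are executed in order on every repeat, as run_model would do.
    """
    totals = {name: 0.0 for name, _ in stages}
    for _ in range(repeats):
        for name, fn in stages:
            start = time.perf_counter()
            fn()
            totals[name] += time.perf_counter() - start
    return {name: total / repeats for name, total in totals.items()}


# Hypothetical usage, assuming `m` is the graph runtime module, `x` is the
# numpy input, and "data" is the model's input name:
#
#   means = time_stages([
#       ("set_input", lambda: m.set_input("data", x)),
#       ("run", m.run),
#       ("get_output", lambda: m.get_output(0).asnumpy()),
#   ])
#   for name, seconds in means.items():
#       print(f"{name}: {seconds * 1e3:.3f} ms")
```

Timing the stages in the same loop keeps cache and allocator behavior close to what run_model sees, at the cost of a small perf_counter overhead per stage.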

Thanks!

Ping: has anyone dealt with this before? @haichen, do you have any thoughts?

Currently there is no zero-copy option for going from a numpy ndarray to a TVM ndarray. If the input is already a TVM ndarray, there is an API called set_input_zero_copy that avoids copying the data.
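As a rough sketch of how set_input_zero_copy might be called from Python: the helper name `set_input_no_copy` is hypothetical, and looking the function up through `graph_module.module["set_input_zero_copy"]` is an assumption about how the Python wrapper exposes it, which may differ between TVM versions.

```python
def set_input_no_copy(graph_module, name, tvm_array):
    """Bind an existing TVM ndarray as a graph input without copying it.

    Assumption: `graph_module.module` is the underlying runtime module and
    supports lookup of packed functions by name. `tvm_array` must already
    be a TVM ndarray with the shape and dtype the graph expects.
    """
    zero_copy = graph_module.module["set_input_zero_copy"]
    zero_copy(name, tvm_array)


# Hypothetical usage, assuming `m` is the graph runtime module and `x_tvm`
# is a TVM ndarray (creating it from numpy still costs one copy):
#
#   set_input_no_copy(m, "data", x_tvm)
#   m.run()
```

This only helps if the data can live in a TVM ndarray from the start; converting a numpy array on every call reintroduces the copy.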