How can I test the performance of a single operator?

Hi guys, I’m new to TVM and I was trying to test the performance of a single operator on an NVIDIA GPU. So far I found the doc about how to benchmark a whole model (https://github.com/apache/incubator-tvm/blob/main/apps/benchmark/README.md) and the doc about [Tuning High Performance Convolution on NVIDIA GPUs](https://tvm.apache.org/docs/tutorials/autotvm/tune_conv2d_cuda.html), but neither of them satisfies my needs. What I want is to measure the performance of a single op, rather than benchmarking a whole model or defining my own operator and tuning it. Can anybody tell me how to do this?

Hello @haozech. There are two ways you can go about benchmarking a single operator. You can either 1. benchmark a specific implementation of the operator or 2. benchmark all implementations of the operator.

For 1, follow the Tuning High Performance Convolution on NVIDIA GPUs tutorial, but skip section 1. In section 2, replace "tutorial/conv2d_no_batching" in task = autotvm.task.create("tutorial/conv2d_no_batching", args=(N, H, W, CO, CI, KH, KW, strides, padding), target="cuda") with the name of the implementation you want to benchmark. You can find the name of the implementation by grepping the codebase for @autotvm.register_topi_compute. You’ll also have to modify the inputs so that they match what the function expects. Furthermore, you’ll have to replace conv2d_no_batching and conv2d_nchw_python in the last code block with the correct function names (these should be the names of the functions annotated with @autotvm.register_topi_compute).
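For example, if grepping turns up conv2d_nchw.cuda (one of the CUDA conv2d implementations registered that way), the task creation could look roughly like the sketch below. The tensor arguments are described as ("TENSOR", shape, dtype) tuples, and the argument order has to match that compute function’s signature (data, kernel, strides, padding, dilation, out_dtype); the shapes here are just an illustration.

import tvm
from tvm import autotvm

# Placeholder tensors are described as ("TENSOR", shape, dtype) tuples when creating a task.
data = ("TENSOR", (1, 512, 7, 7), "float32")      # NCHW input
kernel = ("TENSOR", (512, 512, 3, 3), "float32")  # OIHW weight

# "conv2d_nchw.cuda" is the name registered with @autotvm.register_topi_compute;
# substitute whichever implementation you grep out of the codebase.
task = autotvm.task.create(
    "conv2d_nchw.cuda",
    args=(data, kernel, (1, 1), (1, 1), (1, 1), "float32"),
    target="cuda",
)
print(task.config_space)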

For 2, follow the Auto-tuning a convolutional network for NVIDIA GPU tutorial. Replace get_network with a function that returns a relay function with a single operator like so:

import tvm
from tvm import relay

# relay.my_function is a placeholder for whichever Relay op you want to benchmark.
x = relay.Var("x", tvm.relay.TensorType([40, 40]))
y = relay.Var("y", tvm.relay.TensorType([40, 40]))
func = relay.Function([x, y], relay.my_function(x, y))
mod = tvm.IRModule.from_expr(func)

Thank you for your reply! It’s really helpful. Well, I found that in Tuning High Performance Convolution on NVIDIA GPUs, step 2 will do tuning and find the best config. Is there any way to skip tuning and just test the default config’s performance?

By the way, for 2, the function should return 4 values: mod, params, input_shape, output_shape. But I didn’t see params in the code?


To test without a tuned config, remove the with autotvm.apply_history_best(...) statement. AutoTVM will then fall back to the template’s default configuration (and print a warning saying so).
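For the tutorial’s last block, that means something roughly like the sketch below, reusing the tutorial’s shapes and its conv2d_no_batching template (or whichever implementation you substituted for it):

import numpy as np
import tvm
from tvm.topi.testing import conv2d_nchw_python

# Shapes from the tutorial.
N, H, W, CO, CI, KH, KW, strides, padding = 1, 7, 7, 512, 512, 3, 3, 1, 1

# No `with autotvm.apply_history_best(...)` here, so the template
# (conv2d_no_batching, defined earlier in the tutorial) is instantiated
# with AutoTVM's fallback (default) configuration.
with tvm.target.Target("cuda"):
    s, arg_bufs = conv2d_no_batching(N, H, W, CO, CI, KH, KW, strides, padding)
    func = tvm.build(s, arg_bufs)

# Prepare inputs and an output buffer, then time the kernel.
a_np = np.random.uniform(size=(N, CI, H, W)).astype(np.float32)
w_np = np.random.uniform(size=(CO, CI, KH, KW)).astype(np.float32)
c_np = conv2d_nchw_python(a_np, w_np, strides, padding)

ctx = tvm.gpu(0)
a_tvm = tvm.nd.array(a_np, ctx)
w_tvm = tvm.nd.array(w_np, ctx)
c_tvm = tvm.nd.empty(c_np.shape, "float32", ctx)

evaluator = func.time_evaluator(func.entry_name, ctx, number=400)
print("Default-config time: %f s" % evaluator(a_tvm, w_tvm, c_tvm).mean)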

params is a dictionary mapping from the name of a weight to the actual values of the weight. In this case, you have no weights, so it is just the empty dictionary.
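Putting the two answers together, a replacement for get_network could look roughly like this sketch (relay.my_function is still just a placeholder for the real op, and the shapes are illustrative):

import tvm
from tvm import relay

def get_network(name, batch_size):
    """Single-operator stand-in for the tutorial's get_network.

    Returns the (mod, params, input_shape, output_shape) tuple the tuning
    script expects. relay.my_function is a placeholder for the real op.
    """
    input_shape = (40, 40)
    output_shape = (40, 40)  # adjust to whatever your op actually produces

    x = relay.var("x", shape=input_shape)
    y = relay.var("y", shape=input_shape)
    func = relay.Function([x, y], relay.my_function(x, y))
    mod = tvm.IRModule.from_expr(func)

    params = {}  # no weights, so params is just an empty dict
    return mod, params, input_shape, output_shape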

Problem solved. Thank you!