Evaluating TVM's executors

I'm trying to learn about TVM's executors, and one area I'm particularly interested in is comparing their runtime characteristics (performance, memory usage).

What would be a good benchmark suite I could use? I see tvm/apps/benchmark and a few tests under tests/python. Anything else?

Any pointers around TVM benchmarks & executors architecture would be much appreciated.

Thanks!

Hi @lemo, one way to explore the different executors is the relay.build_module.create_executor(kind) API, where kind can be "graph", "vm", or "debug".

For example, this tutorial loads an ONNX model, compiles it, and executes it in TVM. It uses the graph executor, but you can also try the vm executor by simply replacing "graph" with "vm" in the create_executor call, and then benchmark and compare their performance.
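To make this concrete, here is a minimal sketch of switching executors on the same module. It uses a workload from tvm.relay.testing as a stand-in for the tutorial's ONNX model, and the exact create_executor argument names (e.g. device vs. ctx) have shifted across TVM versions, so treat it as a starting point rather than a definitive recipe:

```python
import numpy as np
import tvm
from tvm import relay
from tvm.relay import testing

# Stand-in workload; in the tutorial this would come from relay.frontend.from_onnx.
mod, params = testing.mobilenet.get_workload(batch_size=1)
data = np.random.uniform(size=(1, 3, 224, 224)).astype("float32")

dev = tvm.cpu(0)
for kind in ["graph", "vm"]:
    # Only `kind` changes; the rest of the pipeline stays the same.
    executor = relay.build_module.create_executor(kind, mod, dev, "llvm")
    func = executor.evaluate()
    out = func(tvm.nd.array(data), **params)
    print(kind, out.numpy().shape)
```

Timing each `func` call (e.g. with `timeit`) then gives a first rough executor comparison.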

Thanks @yuchenj for the tip!

Beyond the ability to switch executors, I’m trying to find out which benchmarks are accepted as interesting and/or representative. Is there a perf regression suite for example? Or a set of models that can be used as a stable baseline?

There are a few benchmark workloads in tvm.relay.testing that you can construct directly and benchmark in TVM: tvm.relay.testing — tvm 0.9.dev182+ge718f5a8a documentation. These are representative models such as MobileNet, ResNet, DenseNet, LSTM, and so on. For example, this tutorial shows how to auto-tune a ResNet for an NVIDIA GPU.
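As a rough sketch of how benchmarking one of these workloads might look (assuming the graph executor, an llvm target, and ResNet-18 from tvm.relay.testing; the time_evaluator settings below are illustrative, not a standard):

```python
import numpy as np
import tvm
from tvm import relay
from tvm.contrib import graph_executor
from tvm.relay import testing

# Construct a representative workload directly from tvm.relay.testing.
mod, params = testing.resnet.get_workload(num_layers=18, batch_size=1)

target = "llvm"
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target, params=params)

dev = tvm.device(target, 0)
module = graph_executor.GraphModule(lib["default"](dev))
module.set_input("data", np.random.uniform(size=(1, 3, 224, 224)).astype("float32"))

# time_evaluator runs the compiled model repeatedly and reports timing statistics.
ftimer = module.module.time_evaluator("run", dev, number=10, repeat=3)
prof_res = np.array(ftimer().results) * 1000  # convert seconds to milliseconds
print("Mean inference time: %.2f ms (std %.2f ms)" % (prof_res.mean(), prof_res.std()))
```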

Great. This seems to be the same set of models used by gpu_imagenet_bench.py, which gives me more confidence that I'm looking at the right models.

For benchmarking purposes, I use my repos https://github.com/masahi/torchscript-to-tvm and https://github.com/masahi/tf2-detection-to-tvm
