Hi @lemo, one way to explore the different executors is the relay.build_module.create_executor(kind) API, where kind can be "graph", "vm", or "debug".
For example, this tutorial loads an ONNX model, compiles it, and executes it in TVM. It uses the graph executor, but you can try the vm executor as well by simply replacing "graph" with "vm" in the create_executor call, then benchmark both and compare their performance.
Beyond the ability to switch executors, I’m trying to find out which benchmarks are accepted as interesting and/or representative. Is there a perf regression suite for example? Or a set of models that can be used as a stable baseline?
There are a few benchmark workloads in tvm.relay.testing that you can construct directly and benchmark in TVM: tvm.relay.testing — tvm 0.9.dev182+ge718f5a8a documentation. These are representative models such as MobileNet, ResNet, DenseNet, LSTM, and so on. For example, this tutorial shows how to auto-tune a ResNet for an NVIDIA GPU.