Any insights on Relay Next performance?

Hi everyone!

First of all, I’d like to mention that I’m very new to TVM :slight_smile: I’m currently trying to evaluate the performance of the code generated by TVM. My current experiment is to implement naive versions of operators (such as conv2d, matmul, etc.) and then build and run an NN model (for example, ResNet-18) using those operators. I came to the conclusion that the easiest way to do that is to use Relay Next. As far as I know, Relax is at the prototype stage, so does anyone have insights into how fair such benchmarks would be? Are there any known performance regressions (compared to, for example, the latest TVM release)?
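For context, here is the kind of “naive” operator I mean — a minimal matmul written with TVM’s tensor expression (TE) API, where the default schedule is just the unoptimized triple loop (shapes are fixed purely for illustration):

```python
import numpy as np
import tvm
from tvm import te

# A naive matmul: te.compute with a reduction axis and the default
# schedule, i.e. no tiling/vectorization/parallelization applied.
M, K, N = 1024, 1024, 1024
A = te.placeholder((M, K), name="A")
B = te.placeholder((K, N), name="B")
k = te.reduce_axis((0, K), name="k")
C = te.compute((M, N), lambda i, j: te.sum(A[i, k] * B[k, j], axis=k), name="C")

s = te.create_schedule(C.op)  # default schedule == the naive loop nest
func = tvm.build(s, [A, B, C], target="llvm")

# Quick sanity check on random inputs.
a = tvm.nd.array(np.random.rand(M, K).astype("float32"))
b = tvm.nd.array(np.random.rand(K, N).astype("float32"))
c = tvm.nd.array(np.zeros((M, N), dtype="float32"))
func(a, b, c)
```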

Hi @Grigory,

Glad to hear that you are interested in Relax and think it’s easy to use!

Relax has been integrated with MetaSchedule (the latest auto-tuning infrastructure in TVM), and our benchmarks show that Relax + MetaSchedule matches the performance of current TVM main on ResNet on an Nvidia V100 GPU.

One thing to note is that Relax does not have many operators yet, but the community is adding high-level ops to it incrementally (e.g., https://github.com/tlc-pack/relax/pull/266). While we build out our own op set/op infra and importers, you can use the relay-to-relax translator to translate a Relay model into Relax and start from there; an example of translating a Relay ResNet model to Relax is here.
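Roughly, the translation flow looks like this (the `relay_translator` helper lives in the tlc-pack/relax prototype, so its exact module path and signature may differ across versions):

```python
import tvm
from tvm import relay
from tvm.relay import testing

# Lives in the tlc-pack/relax prototype; path/signature may have changed.
from tvm.relax.testing import relay_translator

# Get a Relay ResNet-18 workload from the testing utilities ...
relay_mod, params = testing.resnet.get_workload(num_layers=18, batch_size=1)

# ... and translate its main function into a Relax IRModule.
# Some prototype versions also require a target argument here.
relax_mod = relay_translator.from_relay(relay_mod["main"])
```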

If your goal is benchmarking, you can apply the Relax fusion passes (FuseOps and FuseTIR) to the graph and then auto-tune the Relax program with MetaSchedule. Here is an e2e tuning script, and here is the script that tunes the ResNet workload over RPC on localhost using the e2e tuning script.
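A condensed sketch of that flow, continuing from the translation sketch above (`relax_mod` and `params` are the translated module and weights); the `tune_relax` entry point and its arguments are modeled on the prototype’s e2e scripts and may differ in your checkout, so treat the tuning call as illustrative:

```python
import tvm
from tvm import relax
from tvm import meta_schedule as ms

# Fusion: FuseOps groups fusable high-level ops into composite functions,
# FuseTIR then merges each group into a single TIR PrimFunc.
seq = tvm.transform.Sequential(
    [relax.transform.FuseOps(), relax.transform.FuseTIR()]
)
relax_mod = seq(relax_mod)

# Auto-tune with MetaSchedule. Name/signature of the Relax tuning entry
# point varies across prototype versions; this mirrors the e2e scripts.
target = tvm.target.Target("llvm -num-cores 8")
db = ms.relax_integration.tune_relax(
    mod=relax_mod,
    params=params,
    target=target,
    work_dir="./tuning_logs",
    max_trials_global=2000,
)
```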

If you want to learn TVM and Relax, the best resource in my mind is the MLC course (Episodes 4, 6, and 9 are about Relax). Here is a Jupyter notebook with a series of Relax demos that walk through the overall compilation and execution flow, including the tuning part: relax_demo.ipynb · GitHub.

We are happy to answer your follow-up questions! :slight_smile:

Thank you for such a quick and detailed answer! The MLC course seems extremely helpful for my needs.

> Relax has been integrated with MetaSchedule (the latest auto-tuning infrastructure in TVM), and our benchmarks show that Relax + MetaSchedule matches the performance of current TVM main on ResNet on an Nvidia V100 GPU.

In your demo code, I can’t see where the GPU is used as a backend.

How do I run the current MetaSchedule tuning on a GPU backend?

Answered in the other thread: Does tvm support dynamic input shape? - #16 by yuchenj
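In short: MetaSchedule derives its GPU schedule rules (thread binding, shared-memory tiling, etc.) from the target you pass to the tuning entry point, so switching to the GPU backend is mostly a matter of supplying a CUDA target, e.g.:

```python
import tvm

# The "nvidia/nvidia-v100" target tag bundles the right arch and thread
# limits for a V100; a plain "cuda" target also works. Pass this target
# (instead of the llvm one) into the same tuning call as above.
target = tvm.target.Target("nvidia/nvidia-v100")
```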