Hi all, I’m not very familiar with TVM. Two years ago, I noticed that TVM has two types of runtime: the graph executor and the Relay VM. The graph executor has good performance but only supports static-shape models, while the Relay VM supports dynamic models but does not have good performance.
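To make the comparison concrete, here is a rough sketch of the two paths on a toy static-shape model, as I understand them. This is just my own untested sketch assuming a reasonably recent TVM build; exact details (e.g. `.numpy()` vs `.asnumpy()`) may differ across versions.

```python
import numpy as np
import tvm
from tvm import relay
from tvm.contrib import graph_executor

# A tiny static-shape Relay function: y = x + 1
x = relay.var("x", shape=(1, 4), dtype="float32")
mod = tvm.IRModule.from_expr(relay.Function([x], x + relay.const(1.0)))

dev = tvm.cpu()
data = np.zeros((1, 4), dtype="float32")

# Path 1: graph executor (static shapes only)
lib = relay.build(mod, target="llvm")
gmod = graph_executor.GraphModule(lib["default"](dev))
gmod.set_input("x", data)
gmod.run()
print(gmod.get_output(0).numpy())

# Path 2: Relay VM (can also handle dynamic shapes)
vm_exec = relay.vm.compile(mod, target="llvm")
vm = tvm.runtime.vm.VirtualMachine(vm_exec, dev)
print(vm.invoke("main", data).numpy())
```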
Here are my questions:
Is that still the case for the graph executor and the Relay VM?
How can I keep track of the evolution of the community? There are a lot of posts, and I can’t find a summary easily.
Hi Tianqi, thank you for your reply! I read the post and found some useful information. It seems Relax is the next-generation IR after Relay and will have better support for dynamic shapes. It also seems it’s still on the way, and the support (compiler, runtime) for Relax is not ready yet?
But even after a lot of searching in the forum and the manual, I still can’t find the status of some topics I’m interested in. I want to ask some more specific questions:
How can I find out the status of the Relay VM:
Support for partially dynamic shapes
Benchmarks on some models, or the overhead compared with the graph executor
Is it still available and encouraged for use, or is it just experimental work that has been discarded?
Something like that. I want to know how I can find this kind of information.
Are there any blogs or posts introducing these topics?
The forum would be the right place to ask these questions.
Relax already comes with a runtime and compiler, and model coverage is being developed. You can also check out https://mlc.ai/ to get a taste of it.
In terms of Relay and its VM backend: Relay is still the encouraged path as of now if you want out-of-the-box compilation, and the VM is still being maintained and supported.
It comes with support for partially dynamic shapes in the form of the ? dimension, so we are not able to tell, say, that the dynamic dimensions of two operators are the same. It relies on a memory pool, which could be slightly worse than a static allocator, but we expect that on most models it would have similar performance to the graph runtime.
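As a rough sketch of what the ? dimension looks like in practice (untested, API details may vary slightly between versions), a dimension can be left dynamic with relay.Any(), and the VM serves different shapes from one compiled artifact:

```python
import numpy as np
import tvm
from tvm import relay

# Leave the batch dimension dynamic ("?") with relay.Any(); the other dim stays static.
x = relay.var("x", shape=(relay.Any(), 8), dtype="float32")
mod = tvm.IRModule.from_expr(relay.Function([x], relay.nn.relu(x)))

# Dynamic shapes go through the VM; the graph executor would reject Any().
exe = relay.vm.compile(mod, target="llvm")
vm = tvm.runtime.vm.VirtualMachine(exe, tvm.cpu())

# The same compiled module handles different batch sizes at runtime.
for batch in (1, 5, 32):
    out = vm.invoke("main", np.random.rand(batch, 8).astype("float32"))
    print(out.numpy().shape)
```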