Is it possible to run TinyLlama using Relay?

Irfan · February 12, 2025, 6:44pm

If we somehow make the Llama model static instead of dynamic, is it possible to run it with the graph executor after converting it to a Relay module? I have been through the Relax documentation and it is possible there, but I need to run it with Relay instead. Any suggestions or guidelines would be greatly appreciated. Thank you!

Hzfengsy · February 13, 2025, 10:46am

Relay does not support dynamic shapes or a KV cache, so it's hard to run Llama through Relay.

We are phasing out Relay, so please try moving to Relax.
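
To illustrate why the KV cache and dynamic shapes go together, here is a rough sketch in plain Python. It uses no TVM APIs; `HEAD_DIM`, `MAX_SEQ_LEN`, and both function names are made up for the example. The first function grows the cache by one row per decoded token, so its shape changes at every step. The second preallocates a padded cache of fixed length and writes into it by position, which is roughly what "making the model static" would have to look like. Whether that padded variant can actually be expressed and run through Relay's graph executor is not confirmed anywhere in this thread.

```python
# Illustration only: plain Python lists stand in for tensors, no TVM involved.
HEAD_DIM = 4      # hypothetical per-head feature size
MAX_SEQ_LEN = 8   # hypothetical fixed upper bound on sequence length


def dynamic_cache_shapes(num_steps):
    """Record the cache shape when one row is appended per decode step."""
    cache = []
    shapes = []
    for _ in range(num_steps):
        cache.append([0.0] * HEAD_DIM)          # cache grows with each new token
        shapes.append((len(cache), HEAD_DIM))   # so the shape differs every step
    return shapes


def static_cache_shapes(num_steps):
    """Record the cache shape when a padded, preallocated buffer is updated in place."""
    if num_steps > MAX_SEQ_LEN:
        raise ValueError("num_steps exceeds the preallocated cache length")
    cache = [[0.0] * HEAD_DIM for _ in range(MAX_SEQ_LEN)]
    shapes = []
    for pos in range(num_steps):
        cache[pos] = [1.0] * HEAD_DIM           # overwrite the slot for this position
        shapes.append((len(cache), HEAD_DIM))   # shape stays fixed across steps
    return shapes


print(dynamic_cache_shapes(3))  # [(1, 4), (2, 4), (3, 4)]
print(static_cache_shapes(3))   # [(8, 4), (8, 4), (8, 4)]
```

Even with the fixed-shape version, the model still needs a way to carry the cache and the current position from one step to the next, and attention has to mask out the unused padded slots.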