Is it possible to run TinyLlama using Relay?

If we somehow make the Llama model static instead of dynamic, is it possible to run it with the graph executor after converting it to a Relay module? I have gone through the Relax documentation and it is possible there, but I need to run it with Relay instead. Any suggestions or guidelines would be greatly appreciated. Thank you!

Relay does not support dynamic shapes or a KV cache, so it is hard to run the model through Relay.
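
To make the shape issue concrete, here is a minimal pure-Python sketch (not TVM code) of why a KV cache is a dynamic-shape problem and what a "static" variant would look like. The names `MAX_LEN`, `HEAD_DIM`, `dynamic_cache_shapes`, and `static_cache_shapes` are hypothetical and chosen only for illustration. It says nothing about whether Relay's graph executor would handle the padded variant efficiently.

```python
# Illustration only: how the KV-cache shape evolves during autoregressive decoding.
# Every name below is hypothetical; none of this is a TVM or Relay API.

MAX_LEN = 8    # assumed fixed context length for the static variant
HEAD_DIM = 4   # assumed per-head hidden size


def dynamic_cache_shapes(num_steps):
    """The cache grows by one row per generated token, so its shape changes each step."""
    cache = []
    shapes = []
    for _ in range(num_steps):
        cache.append([0.0] * HEAD_DIM)        # append the new token's key/value row
        shapes.append((len(cache), HEAD_DIM))
    return shapes


def static_cache_shapes(num_steps):
    """The cache is preallocated to MAX_LEN rows; a mask marks which rows are valid."""
    if num_steps > MAX_LEN:
        raise ValueError("num_steps exceeds the preallocated MAX_LEN")
    cache = [[0.0] * HEAD_DIM for _ in range(MAX_LEN)]
    mask = [0] * MAX_LEN
    shapes = []
    for step in range(num_steps):
        cache[step] = [1.0] * HEAD_DIM        # overwrite the row at position `step`
        mask[step] = 1                        # mark that row as valid for attention
        shapes.append((len(cache), HEAD_DIM))
    return shapes, mask


if __name__ == "__main__":
    print(dynamic_cache_shapes(3))
    print(static_cache_shapes(3))
```

In the first function the sequence dimension differs at every decode step, which is the dynamic-shape case. In the second, every step sees the same tensor shape, at the cost of a fixed maximum length and the need to update the cache in place and mask the unused rows.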

We are phasing out Relay, so please try moving to Relax 🙂
