If we somehow make the Llama model static instead of dynamic, is it possible to run it with the graph executor after converting it to a Relay module? I have gone through the Relax documentation and it is possible there, but I need to run it with Relay instead. Any suggestions or guidelines would be greatly appreciated. Thank you!
Relay does not support dynamic shapes or a KV cache, so it is hard to run the model through Relay.
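
To make the "static instead of dynamic" idea concrete, here is a minimal plain-Python sketch (not TVM code) comparing a KV cache that grows by one entry per decoding step with one kept in fixed-size buffers plus a mask. The names `MAX_LEN`, `DIM`, `attend_dynamic`, `attend_static`, and `run_both` are hypothetical and exist only for this illustration. It assumes a single attention head and says nothing about whether Relay can express the per-step cache update.

```python
import math

MAX_LEN = 4  # hypothetical fixed context length, chosen ahead of time
DIM = 2      # hypothetical head dimension


def attend_dynamic(q, keys, values):
    """Attention over a cache whose length grows every step (dynamic shape)."""
    scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(DIM) for k in keys]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    return [sum(w * v[d] for w, v in zip(weights, values)) / total
            for d in range(DIM)]


def attend_static(q, key_buf, value_buf, length):
    """Same attention over fixed-size buffers; slots >= length are masked."""
    scores = [
        sum(a * b for a, b in zip(q, key_buf[i])) / math.sqrt(DIM)
        if i < length else -math.inf
        for i in range(MAX_LEN)
    ]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]  # masked slots get weight 0.0
    total = sum(weights)
    return [sum(w * value_buf[i][d] for i, w in enumerate(weights)) / total
            for d in range(DIM)]


def run_both(steps):
    """Decode step by step, returning (dynamic, static) outputs per step."""
    keys, values = [], []                               # shapes change per step
    key_buf = [[0.0] * DIM for _ in range(MAX_LEN)]     # shape never changes
    value_buf = [[0.0] * DIM for _ in range(MAX_LEN)]
    outputs = []
    for pos, (q, k, v) in enumerate(steps):
        keys.append(k)
        values.append(v)
        key_buf[pos] = k       # write into a fixed slot instead of appending
        value_buf[pos] = v
        outputs.append((attend_dynamic(q, keys, values),
                        attend_static(q, key_buf, value_buf, pos + 1)))
    return outputs


# Each step is (query, key, value) for one new token.
steps = [
    ([1.0, 0.0], [1.0, 0.0], [1.0, 2.0]),
    ([0.0, 1.0], [0.0, 1.0], [3.0, 4.0]),
    ([1.0, 1.0], [1.0, 1.0], [5.0, 6.0]),
]

for dyn, sta in run_both(steps):
    print(dyn, sta)
```

Both variants should produce the same attention output at every step. The static one trades this for a hard limit of `MAX_LEN` tokens and an extra position input, and the cache still has to be carried as state between calls.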
We are phasing out Relay; please try moving to Relax.