Hi, I’m trying to use TVM for LLM inference.
I followed the TVM tutorial on optimizing large language models (Optimize Large Language Model — tvm 0.18.dev0 documentation), and it works well for the single-batch case.
But I cannot find any documentation or reference on how to do batched inference.
Is there any way to run batched inference for an LLM with TVM?
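
For reference, here is a rough sketch of what I imagine the batched setup could look like, based on the `vm.builtin.kv_state_*` runtime functions that the tutorial's PagedKVCache uses. The `vm`, `kv_cache`, and `batched_embeddings` objects are assumed to come from the tutorial's compiled module, the sequence IDs and prompt lengths are made up, and a batch-aware `prefill` is my assumption (the tutorial exports it with batch size 1):

```python
import tvm

# Assumed from the tutorial: `vm` is the compiled relax VM, and `kv_cache`
# was created via vm["create_tir_paged_kv_cache"](...).
add_seq = tvm.get_global_func("vm.builtin.kv_state_add_sequence")
begin_forward = tvm.get_global_func("vm.builtin.kv_state_begin_forward")
end_forward = tvm.get_global_func("vm.builtin.kv_state_end_forward")

# Register one KV-cache sequence per request in the batch.
seq_ids = [0, 1]
prompt_lens = [7, 5]  # hypothetical token counts of the two prompts
for sid in seq_ids:
    add_seq(kv_cache, sid)

# Tell the cache which sequences take part in this step and how many
# tokens each one appends, then run a single forward pass over the batch.
begin_forward(kv_cache,
              tvm.runtime.ShapeTuple(seq_ids),
              tvm.runtime.ShapeTuple(prompt_lens))
logits = vm["prefill"](batched_embeddings, kv_cache)  # assumes batched export
end_forward(kv_cache)
```

If this is roughly the right direction, my main question is how to export and compile `prefill`/`decode` with a dynamic batch dimension, since the tutorial traces them with batch size 1.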