Hello.
I have found that it is difficult to compile a module that supports dynamic batch sizes (i.e., accepts batches of multiple sizes as input).
As an alternative, what I want to do is compile the DNN model (written in a framework such as PyTorch, ONNX, or TensorFlow) into multiple modules, one per batch size, and let the modules share the parameters (i.e., the weights). To be specific, I want to load the parameters only once (into main memory or GPU memory). Then I would launch multiple modules that run the same DNN model but were compiled for different batch sizes, and have those modules share the pre-loaded parameters.
I believe that, since the (weight) parameters are not changed by compilation, a single copy of the parameter data can be reused across multiple modules that run the same DNN model. Is this possible with TVM?
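To make the intended usage concrete, here is a minimal sketch of the sharing pattern I have in mind, with plain NumPy functions standing in for the compiled modules (the names `make_module`, `run_batch1`, `run_batch8` and the weight shapes are hypothetical, not TVM API):

```python
import numpy as np

# Load the parameters once; both "modules" below reference this same dict
# (no copies). This single shared buffer is the behavior I am asking about.
params = {
    "w": np.random.randn(16, 16).astype("float32"),
    "b": np.zeros(16, dtype="float32"),
}

def make_module(batch_size, shared_params):
    """Stand-in for a module compiled for one fixed batch size."""
    def run(x):
        # A compiled module has a fixed input shape, hence the check.
        assert x.shape[0] == batch_size
        return x @ shared_params["w"] + shared_params["b"]
    return run

# Two "compiled" modules for different batch sizes, sharing one weight buffer.
run_batch1 = make_module(1, params)
run_batch8 = make_module(8, params)

out1 = run_batch1(np.ones((1, 16), dtype="float32"))
out8 = run_batch8(np.ones((8, 16), dtype="float32"))
```

Since both modules compute the same function over the same weights, a row of the batch-8 output matches the batch-1 output; only the parameter dict is held in memory once.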
Thanks a lot.