Does TVM have any mechanism to support dynamic shapes?

I ran some tests on a BERT model and found that seq_len varies from 64 to 512 when using padding. Performance may be affected a lot by changes in sequence length. The model has only one input, input_ids [batch_size, seq_length].

So I compiled several libraries (one per shape), but the device memory grows to a multiple of the original model's needs. Is there any method to handle this?

I don't need full dynamic-shape support, just support for several fixed cases: one compiled library that could run several seq_length values, e.g. (128, 256, 384).
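One common workaround for this "several fixed lengths" case is bucketing: pad each incoming batch up to the nearest of the few compiled lengths, so only a handful of fixed-shape libraries are needed. A minimal sketch in plain Python (the bucket sizes, `pad_id`, and helper names are my own assumptions, not TVM API):

```python
# Hypothetical bucketing helpers: pad a batch of token-id sequences up to
# the nearest compiled bucket length, so one of a few fixed-shape
# libraries can serve any seq_len up to the largest bucket.
BUCKETS = (128, 256, 384)  # assumed compiled seq_length values

def pick_bucket(seq_len, buckets=BUCKETS):
    """Return the smallest bucket >= seq_len, or raise if none fits."""
    for b in sorted(buckets):
        if seq_len <= b:
            return b
    raise ValueError(f"seq_len {seq_len} exceeds largest bucket {max(buckets)}")

def pad_to_bucket(input_ids, pad_id=0, buckets=BUCKETS):
    """Pad each sequence (a list of token ids) to the chosen bucket length."""
    bucket = pick_bucket(max(len(s) for s in input_ids), buckets)
    return [s + [pad_id] * (bucket - len(s)) for s in input_ids]
```

The trade-off is some wasted compute on padding tokens versus compiling (and holding in device memory) one library per possible length.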

If I have to compile several engines, is there any mechanism to share the weights that have the same contents?

I’m also interested in this.

I tried BERT with TensorRT, and it supports dynamic shape ranges.

As far as I know, TVM uses the Relay VM to support dynamic shapes. You can try it.
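For reference, a rough sketch of what that looks like: declare the sequence dimension as `relay.Any()` and compile with the VM compiler instead of the graph executor. The toy function body below is just a placeholder standing in for a real BERT graph, and the exact flow may differ across TVM versions (this requires TVM installed):

```python
import numpy as np
import tvm
from tvm import relay

# Sketch, not a tested recipe: mark seq_length as dynamic with relay.Any(),
# so a single compiled artifact can run several sequence lengths.
input_ids = relay.var("input_ids", shape=(1, relay.Any()), dtype="int32")
body = relay.cast(input_ids, "float32")  # placeholder for the real model body
mod = tvm.IRModule.from_expr(relay.Function([input_ids], body))

# Dynamic shapes go through the Relay VM compiler, not the graph executor.
vm_exec = relay.vm.compile(mod, target="llvm")
vm = tvm.runtime.vm.VirtualMachine(vm_exec, tvm.cpu())

# The same executable can be invoked with different sequence lengths.
for seq_len in (128, 256, 384):
    out = vm.invoke("main", np.zeros((1, seq_len), dtype="int32"))
```

Note that kernels compiled for dynamic dimensions are generally less tuned than fixed-shape ones, so it is worth benchmarking against the multi-library approach.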