Does TVM support dynamic input shapes?

Hi @puddingfjz, we are working on Relax (Relay Next), which has first-class symbolic shape support. Here is a simple dynamic input shape example. Please check out the overview of Relax and our TVMCon talk for more information.

@ziheng has worked on dynamic-shape workload tuning: DietCode: Automatic Optimization for Dynamic Tensor Programs - TVMCon 2021.

@yongwww is working on supporting LSTM on Relax and on adding control flow support (e.g., closure, while_loop) to Relax.

Thanks for your quick reply!

Is the DietCode work integrated into TVM now? Are there any examples showing how to use TVM to optimize operators with dynamic input shapes?

Are there examples of LSTM support, or can you point to the part of the code related to it?

DietCode is not upstreamed yet, as far as I know. cc @ziheng, @ArmageddonKnight.

We will add an LSTM demo with dynamic input shape to the Relax repo, probably next month. We will keep you updated!

Good summary, thanks. I hope the dynamic shape feature is well supported in TVM ASAP.

Is the demo ready now?

Hi @gfvvz, @yongwww is working on Relax control flow support, which is necessary to support LSTM. After that, we will also need to support TensorArray. We pinned down the design after discussing it with the community in the Relax open dev meetings, and Yong is now working on the implementation.

We expect to deliver the LSTM demo (task list and tracking) with dynamic input shape in June. Stay tuned!

Are there any updates regarding models with dynamic shape inputs?

I ran a model with dynamic inputs and there wasn't any improvement in inference time.

Hi @fnakhaee, did you run the model with dynamic inputs in Relay or Relax?

Currently in Relax, we are working on matching Relay performance for static workloads by adding passes like FuseOps. After that, we will shift our focus to improving performance for dynamic workloads, which involves work on dynamic shape tuning, as masahi mentioned before, and on Relax passes such as dynamic memory planning.

Feel free to check out the Relax pre-RFC for more details, as well as the discussion we had in the TVM community meeting yesterday: https://youtu.be/2aYWGOYmDFY.

Hi, yuchen. I saw your code example about dynamic shape.

I changed the target and the virtual machine device to CUDA.

But I got an error about missing thread binding. Does Relax currently support dynamic shape on GPU, or do I need to add other code? @yuchenj

Hi @chenugray,

The default TIR PrimFunc emitted by EmitTE does not have a thread binding construct, so Relax currently relies on MetaSchedule to do thread binding when targeting GPU. You just need to perform one tuning trial per task (subgraph).

You can find a test case here: https://github.com/tlc-pack/relax/blob/relax/tests/python/relax/test_autotir_integration.py#L131.

@yuchenj, here is my code example. I want to run a custom model with a dynamic batch size on GPU.

It seems that MetaSchedule can't tune dynamic tasks.

from __future__ import annotations  # must import to defer parsing of annotations
import numpy as np
import tvm
from tvm import relax, tir
from tvm.relax.testing import nn
import tempfile
from tvm import meta_schedule as ms

builder = relax.BlockBuilder()

input_size = 784
hidden_sizes = [128, 64]
output_size = 10
target = tvm.target.Target(
    "cuda --host=llvm --max_threads_per_block=1024 --max_shared_memory_per_block=49152")
database = ms.database.MemoryDatabase()

with builder.function(name="main"):
    model = nn.Sequential(
        nn.Linear(input_size, hidden_sizes[0]),
        nn.ReLU(),
        nn.Linear(hidden_sizes[0], hidden_sizes[1]),
        nn.ReLU(),
        nn.Linear(hidden_sizes[1], output_size),
        nn.LogSoftmax(),
    )
    # n is a symbolic variable to represent a dynamic batch size
    n = tir.Var("n", "int64")
    data = nn.Placeholder((n, input_size), name="data")
    output = model(data)
    params = [data] + model.parameters()
    builder.emit_func_output(output, params=params)

mod = builder.get()
with tempfile.TemporaryDirectory() as work_dir:
    db = ms.relax_integration.tune_relax(
        mod=mod,
        target=target,
        params=None,
        num_trials_per_iter=2,
        max_trials_per_task=4,
        max_trials_global=4,
        work_dir=work_dir,
        database=database,
    )

relax_ex = ms.relax_integration.compile_relax(
    db, mod=mod, target=target, params=None)
vm = relax.VirtualMachine(relax_ex, tvm.cuda())
params = nn.init_params(mod)
# init parameters
# the input data has a minibatch size of 3
data = tvm.nd.array(np.random.rand(3, input_size).astype(np.float32))
res = vm["main"](data, *params)

I got the error below:

2: tvm::tir::StmtFunctor<tvm::tir::TResult (tvm::tir::Stmt const&)>::InitVTable()::{lambda(tvm::runtime::ObjectRef const&, tvm::tir::StmtFunctor<tvm::tir::TResult (tvm::tir::Stmt const&)>*)#4}::_FUN(tvm::runtime::ObjectRef const&, tvm::tir::StmtFunctor<tvm::tir::TResult (tvm::tir::Stmt const&)>*)
        at /home/lei.zhang/src_code/cmccperf/relax/include/tvm/tir/stmt_functor.h:115
  1: tvm::tir::StmtFunctor<tvm::tir::TResult (tvm::tir::Stmt const&)>::InitVTable()::{lambda(tvm::runtime::ObjectRef const&, tvm::tir::StmtFunctor<tvm::tir::TResult (tvm::tir::Stmt const&)>*)#4}::operator()(tvm::runtime::ObjectRef const&, tvm::tir::StmtFunctor<tvm::tir::TResult (tvm::tir::Stmt const&)>*) const
        at /home/lei.zhang/src_code/cmccperf/relax/include/tvm/tir/stmt_functor.h:115
  0: tvm::tir::FlopEstimator::VisitStmt_(tvm::tir::ForNode const*)
        at /home/lei.zhang/src_code/cmccperf/relax/src/tir/analysis/estimate_flops.cc:143
  File "/home/lei.zhang/src_code/cmccperf/relax/src/tir/analysis/estimate_flops.cc", line 143
TVMError: 
---------------------------------------------------------------
An error occurred during the execution of TVM.
For more information, please see: https://tvm.apache.org/docs/errors.html
---------------------------------------------------------------
  Check failed: (int_imm) is false: TypeError: Expect the extent of a loop to be IntImm, but gets: tir.Var

Does tuning dynamic tasks require DietCode? And when will it be integrated into TVM?

Hi @chenugray,

MetaSchedule currently does not support tuning dynamic workloads. I don't know the plan for DietCode, but dynamic tuning is of course an important part of Relax and we will go for it (though there is no clear timeline for now either).
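To illustrate why the tuner trips over a symbolic batch size, here is a minimal plain-Python sketch of a FLOP estimator that walks a loop nest. It is not TVM's implementation: `SymVar`, `Loop`, `Op`, and `estimate_flops` are hypothetical names made up for this illustration, and the loop sizes are borrowed from the first Linear layer in the example above. The point is only that a cost estimate multiplies loop extents together, so an extent that is a variable rather than a constant leaves the estimate undefined, which is the spirit of the `IntImm` check in the trace above.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class SymVar:
    """Hypothetical stand-in for a symbolic extent such as tir.Var("n")."""
    name: str


Extent = Union[int, SymVar]


@dataclass
class Op:
    """A leaf statement that costs `flops` floating-point operations."""
    flops: int


@dataclass
class Loop:
    """A toy loop node: run `body` for `extent` iterations."""
    extent: Extent
    body: List[Union["Loop", Op]]


def estimate_flops(node: Union[Loop, Op]) -> int:
    """Multiply the body's cost by each loop extent; reject symbolic extents."""
    if isinstance(node, Op):
        return node.flops
    if not isinstance(node.extent, int):
        # A symbolic extent has no numeric value at compile time.
        raise TypeError(
            f"Expect the extent of a loop to be an int, but got symbolic var {node.extent.name!r}"
        )
    return node.extent * sum(estimate_flops(child) for child in node.body)


# Static nest: batch 3, 128 outputs, 784 inputs, 2 FLOPs (multiply + add) each.
static_nest = Loop(3, [Loop(128, [Loop(784, [Op(2)])])])
static_flops = estimate_flops(static_nest)

# Same nest with a symbolic batch dimension n: the estimate cannot be computed.
dynamic_nest = Loop(SymVar("n"), [Loop(128, [Loop(784, [Op(2)])])])
try:
    estimate_flops(dynamic_nest)
    dynamic_error = None
except TypeError as err:
    dynamic_error = str(err)
```

Under this simplified model, one possible workaround is to tune with a fixed representative batch size, but whether the resulting schedule carries over to the dynamic workload is not something this sketch can show.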

Hi @Hzfengsy ,

I am also interested in the dynamic shape feature in TVM and have tried several Relax demos for dynamic input shape, such as nn_module.py, and they all passed. I would now like to load a TorchScript model, convert it to a Relax IRModule only once, and then feed it input data of different shapes. Is there a toy demo? Thanks~

Unfortunately, there are no vertical examples right now. We are working towards it.

Excited to hear that and hope the example can be released ASAP.

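Is there nobody here who speaks Chinese? Haha, reading English is a bit of a struggle for me. I am also following the development of Relax and looking forward to a runnable dynamic-shape demo.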

I am also interested in the dynamic shape feature in TVM, but I could not find a suitable example or API.

Check out https://github.com/mlc-ai/mlc-llm/, which should contain examples of dynamic shape via TVM Unity.

Is there a specific code file or link where I can find the dynamic shape example? Thanks~

import tvm
from tvm import te


def matmul_nn(
    M,
    N,
    K,
    in_dtype="float16",
    out_dtype="float16",
    accum_dtype="float16",
    with_bias=False,
):
    # Any non-int M is replaced by a symbolic variable, making the M dimension dynamic
    if not isinstance(M, int):
        M = tvm.te.var("m")
    A = te.placeholder((M, K), name="A", dtype=in_dtype)
    B = te.placeholder((K, N), name="B", dtype=in_dtype)
    Bias = te.placeholder((N,), name="Bias", dtype=in_dtype)

    # Describe the matrix multiplication in TE
    k = te.reduce_axis((0, K), name="k")
    C = te.compute(
        (M, N),
        lambda i, j: te.sum(A[i, k].astype(accum_dtype) * B[k, j].astype(accum_dtype), axis=k),
        name="C",
    )
    last_output = C
    if accum_dtype != out_dtype:
        D = te.compute((M, N), lambda i, j: C[i, j].astype(out_dtype), name="D")
        last_output = D

    if with_bias:
        E = te.compute((M, N), lambda i, j: last_output[i, j] + Bias[j], name="E")
        last_output = E

    args = [A, B, Bias, last_output] if with_bias else [A, B, last_output]

    func = te.create_prim_func(args)

    return tvm.IRModule.from_expr(func)

Taking TE as an example, M = tvm.te.var("m") makes the M dimension dynamic; the same approach also works in TIR script.
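As a rough mental model of what a symbolic dimension means at call time, here is a plain-Python sketch that binds a symbol such as "m" to the concrete size seen in each call. It does not use TVM and is not how TVM implements shape checking: `bind_shape`, `output_shape`, and the choice of K = N = 16 are assumptions made up for this illustration.

```python
from typing import Dict, Sequence, Tuple, Union

Dim = Union[int, str]  # an int is a static dim, a str names a symbolic dim


def bind_shape(declared: Sequence[Dim], actual: Sequence[int],
               env: Dict[str, int]) -> Dict[str, int]:
    """Match a runtime shape against a declared shape, binding symbols in env."""
    if len(declared) != len(actual):
        raise ValueError(f"rank mismatch: declared {len(declared)}, got {len(actual)}")
    for want, got in zip(declared, actual):
        if isinstance(want, str):
            # The first sighting binds the symbol; later sightings must agree.
            if env.setdefault(want, got) != got:
                raise ValueError(f"{want} bound to {env[want]}, but got {got}")
        elif want != got:
            raise ValueError(f"expected static dim {want}, got {got}")
    return env


# Declared signature of the matmul: A is (m, K) and B is (K, N), with K = N = 16.
A_SHAPE: Tuple[Dim, ...] = ("m", 16)
B_SHAPE: Tuple[Dim, ...] = (16, 16)


def output_shape(a_shape: Sequence[int], b_shape: Sequence[int]) -> Tuple[int, int]:
    """Resolve the (m, N) output shape for one call from the runtime input shapes."""
    env = bind_shape(A_SHAPE, a_shape, {})
    bind_shape(B_SHAPE, b_shape, env)
    return (env["m"], int(B_SHAPE[1]))


# One declaration serves calls with different values of m.
shape_small = output_shape((3, 16), (16, 16))
shape_large = output_shape((7, 16), (16, 16))
```

The static dimensions are still checked on every call, so passing an A of shape (3, 8) raises a ValueError, while any value of m is accepted.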