[TVM Runtime] How to generate a minimum TVM runtime bundle for Python API Usage?

Recently I have been doing some work that requires a minimal form of TVM runtime deployment via the Python API. After walking through the examples from how_to_deploy and some relevant posts on this forum, I still find myself very confused. Here are several of my questions:

  1. Per this tutorial, we can get a minimum runtime of around 300K ~ 600K. How do we do that in the Python API scenario? Can we deploy an already compiled model_runtime.so without building the whole TVM project on a new machine? (Suppose the new machine and the machine we used to compile the model_runtime share the same environment configuration.)

  2. To put my confusion simply: is there any way to bypass the “build TVM” procedure and provide some out-of-the-box, ready-to-use runtime bundle? (i.e. just put a minimal dependency package/folder on the target machine and be able to import the model into whatever Python ML pipeline to do the inference job)

Thanks for the help!

@junrushao @merrymercy @masahi @AndrewZhaoLuo @lhutton1 @anijain2305

You mainly need the tvm python package, plus libtvm_runtime.so (which is smaller than the full libtvm)
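As a sketch, shipping those two pieces to a target machine with a matching environment could look like the commands below. All hostnames and paths here (`target`, `/opt/tvm-deploy`) are hypothetical placeholders, and `TVM_LIBRARY_PATH` is the environment variable the TVM python package consults when locating the runtime library:

```shell
# copy only the python package and the runtime library (paths are assumptions)
scp -r tvm/python/tvm target:/opt/tvm-deploy/python/tvm
scp tvm/build/libtvm_runtime.so target:/opt/tvm-deploy/

# on the target, point Python at the package and at the runtime .so
export PYTHONPATH=/opt/tvm-deploy/python:$PYTHONPATH
export TVM_LIBRARY_PATH=/opt/tvm-deploy
```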


Appreciate the reply!

So is it possible to use the same libtvm_runtime.so (in a straight copy-paste manner) without building the whole TVM project all over again, provided that the target machine has the same configuration as the machine on which I built that libtvm_runtime.so?

Plus, what is the minimal subset of the Python package needed for pure inference, i.e. just the runtime part? Is there any tutorial/post covering a minimal inference pipeline for Python usage?

Likely we do need to build for the target machine, but you can simply run make runtime on the target and it only builds the runtime. This can be used to build for, say, Raspberry Pi or other settings.
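For reference, a runtime-only build on the target might look roughly like the sketch below (paths and the clone location are assumptions; any features you need should be enabled in config.cmake first):

```shell
# fetch the sources with submodules
git clone --recursive https://github.com/apache/tvm tvm
cd tvm

# standard out-of-tree cmake setup
mkdir build && cp cmake/config.cmake build
cd build && cmake ..

# build only libtvm_runtime.so, skipping the full compiler stack
make runtime -j4
```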

If your target machine has the same environment (e.g. both are x86) then copy-paste is fine. The TVM python package comes with everything, but it will detect if only the runtime .so is present and enable just that part of the functionality.
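To illustrate what a runtime-only Python inference pipeline can look like, here is a minimal sketch. The file name `deploy_lib.so`, the input name `"data"`, and the input shape are assumptions — they depend on how the model was compiled (e.g. with `relay.build` and `lib.export_library`):

```python
import numpy as np
import tvm
from tvm.contrib import graph_executor

# load the compiled artifact (exported elsewhere via lib.export_library("deploy_lib.so"))
lib = tvm.runtime.load_module("deploy_lib.so")
dev = tvm.cpu(0)

# instantiate the graph executor from the packaged factory module
module = graph_executor.GraphModule(lib["default"](dev))

# feed an input tensor; name and shape must match the compiled model
module.set_input("data", tvm.nd.array(np.random.rand(1, 3, 224, 224).astype("float32")))
module.run()
output = module.get_output(0).numpy()
```

Only libtvm_runtime.so is needed for this: loading, executing, and reading outputs are all runtime features.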

Many thanks! But as a C++ & CMake newbie, I’m still confused about the mechanism for detecting tvm_runtime.so, since it seems that directly copying the whole python package together with tvm_runtime.so won’t work even if the target machine is the same: the python API simply can’t find the required runtime shared library (.so).

BTW, is there any guide doc on Python deployment? I checked the Deploy Models and Integrate TVM tutorial and only found a brief C++ API intro.

They are searched for in https://github.com/apache/tvm/blob/main/python/tvm/_ffi/libinfo.py#L86
