Does TVM runtime contain any target specific code?

In theory, when running inference using libtvm_runtime, all target-specific code should be contained inside the compiled model itself. Hence libtvm_runtime should not contain any CUDA, Vulkan, or OpenCL specific bits. That means that whatever the configuration is in config.cmake https://github.com/apache/incubator-tvm/blob/main/cmake/config.cmake, the output libtvm_runtime binary should not change. Am I right in my assumption?

While the runtime does not need to contain any specific compiled model, it does need to call into the driver APIs to load the generated code (OpenCL, SPIR-V, etc.) and present it as launchable functions.

So the USE_CUDA and USE_OPENCL flags refer to whether the runtime bundles these device-specific runtimes (which are backend-specific but not model-specific).
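As a sketch of what this means in practice (assuming the flags keep their documented meaning in cmake/config.cmake), a runtime build that bundles the CUDA and OpenCL device APIs but leaves Vulkan out would be configured like this:

```cmake
# config.cmake fragment: each USE_* flag controls whether libtvm_runtime
# bundles that backend's device API (code to load and launch generated
# kernels via the driver), not any model-specific code.
set(USE_CUDA ON)     # include CUDA device API support in libtvm_runtime
set(USE_OPENCL ON)   # include OpenCL device API support
set(USE_VULKAN OFF)  # omit Vulkan support; Vulkan modules cannot be loaded
```

With this configuration, the resulting libtvm_runtime binary does change with the flags: a build with USE_CUDA OFF cannot load a CUDA-compiled module, even though the model's kernels themselves live in the compiled artifact.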
