I’m pretty ignorant on a lot of topics here, so pre-apologies.
What I want to do:
From Python, compile an MXNet model that has a dynamic input size (to handle differently sized images), and save the byte-code.
Then, in C++, load that byte-code into a Relay VM and run inference, taking dynamically sized inputs at runtime.
I am familiar with the C++ GraphRuntime and I’ve poked around vm.h, but before I get too far down a dead end, I wanted to ask: is this currently possible? If so, are there any examples to go off of?
It would seem I would start in Python with something like:

```python
exe = vm.compile(mod, target, etc)
bytecode, lib = exe.save()
# write bytecode to file...
# ...then...
lib.export_library(filename)
```
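To flesh that out: my understanding is that the dynamic dimensions are declared with `relay.Any()` in the shape dict when importing the model, before compiling with `relay.vm.compile`. A sketch of what I have in mind (the model choice, file names, and input name `"data"` are just placeholders, and I haven't verified this end to end):

```python
import mxnet as mx
import tvm
from tvm import relay

# Height and width are unknown at compile time, so mark them
# with relay.Any(); batch and channel stay fixed here.
shape_dict = {"data": (1, 3, relay.Any(), relay.Any())}

# Placeholder model -- any Gluon block should work the same way.
block = mx.gluon.model_zoo.vision.resnet18_v1(pretrained=True)
mod, params = relay.frontend.from_mxnet(block, shape_dict)

with tvm.transform.PassContext(opt_level=3):
    exe = relay.vm.compile(mod, target="llvm", params=params)

# save() splits the executable into serialized byte-code and a
# runtime module holding the compiled kernels.
bytecode, lib = exe.save()
with open("vm_exec.ro", "wb") as f:  # file name is arbitrary
    f.write(bytecode)
lib.export_library("vm_lib.so")
```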
Then from C++, something like:

```cpp
auto mod = tvm::runtime::Module::LoadFromFile(lib_path);
auto exe = tvm::runtime::vm::Executable::Load(byte_code_path, mod);
auto vm = tvm::runtime::vm::VirtualMachine();
vm.LoadExecutable(exe);
// Not sure what to do next ???
```
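From poking around the headers, it looks like the Executable module exposes a PackedFunc that creates a VirtualMachine module, which is then driven through `init` / `set_input` / `invoke` packed functions, GraphRuntime-style. A sketch of my best guess — I'm not at all sure these names and signatures are right, and header locations seem to move between TVM versions:

```cpp
#include <fstream>
#include <sstream>

#include <tvm/runtime/module.h>
#include <tvm/runtime/ndarray.h>
#include <tvm/runtime/packed_func.h>
#include <tvm/runtime/vm/executable.h>  // older TVM: tvm/runtime/vm.h?

int main() {
  // Load the compiled kernels, then the serialized byte-code.
  tvm::runtime::Module lib = tvm::runtime::Module::LoadFromFile("vm_lib.so");
  std::ifstream in("vm_exec.ro", std::ios::binary);
  std::stringstream ss;
  ss << in.rdbuf();
  tvm::runtime::Module exe = tvm::runtime::vm::Executable::Load(ss.str(), lib);

  // Guess: the executable knows how to wrap itself in a VM module.
  tvm::runtime::Module vm = exe.GetFunction("vm_load_executable")();

  // Guess at the init signature: device type, device id, allocator
  // type (the exact argument list seems to vary across versions).
  vm.GetFunction("init")(static_cast<int>(kDLCPU), 0, /*alloc_type=*/2);

  // The whole point: the input can be whatever size this image
  // happens to be, since the shape was relay.Any() at compile time.
  tvm::runtime::NDArray input = tvm::runtime::NDArray::Empty(
      {1, 3, 480, 640}, {kDLFloat, 32, 1}, {kDLCPU, 0});
  vm.GetFunction("set_input")("main", input);

  // For a single-output model the result should convert to an
  // NDArray; multiple outputs presumably come back as an ADT.
  tvm::runtime::NDArray out = vm.GetFunction("invoke")("main");
  (void)out;
  return 0;
}
```

Is this roughly the intended flow, or is there a supported C++ entry point I'm missing?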
Not sure how to proceed