I’m pretty ignorant on a lot of topics here, so pre-apologies.
What I want to do:
From Python, compile an MXNet model that has a dynamic input size (to handle differently sized images), and save the byte-code.
Then, in C++, load that byte-code into a Relay VM and run inference, taking dynamically sized inputs at runtime.
I am familiar with the C++ GraphRuntime and I’ve poked around vm.h, but before I get too far down a dead end, I wanted to ask: is this currently possible? If so, are there any examples to go off of?
It would seem I would start in Python with something like:

```python
exe = vm.compile(mod, target, etc)
bytecode, lib = exe.save()
# write bytecode to file...
# ...then...
lib.export_library(filename)
```
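To flesh that out: my understanding is that the dynamic dimensions are declared with `relay.Any()` in the shape dict when importing the model, before compiling with `relay.vm.compile`. A sketch of what I have in mind (the model choice, file names, and input name `"data"` are just placeholders, and I haven't verified this end to end):

```python
import mxnet as mx
import tvm
from tvm import relay

# Height and width are unknown at compile time, so mark them
# with relay.Any(); batch and channel stay fixed here.
shape_dict = {"data": (1, 3, relay.Any(), relay.Any())}

# Placeholder model -- any Gluon block should work the same way.
block = mx.gluon.model_zoo.vision.resnet18_v1(pretrained=True)
mod, params = relay.frontend.from_mxnet(block, shape_dict)

with tvm.transform.PassContext(opt_level=3):
    exe = relay.vm.compile(mod, target="llvm", params=params)

# save() splits the executable into serialized byte-code and a
# runtime module holding the compiled kernels.
bytecode, lib = exe.save()
with open("vm_exec.ro", "wb") as f:  # file name is arbitrary
    f.write(bytecode)
lib.export_library("vm_lib.so")
```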
Then from C++, something like:

```cpp
auto mod = tvm::runtime::Module::LoadFromFile(lib_path);
auto exe = tvm::runtime::vm::Executable::Load(byte_code_path, mod);
auto vm = tvm::runtime::vm::VirtualMachine();
vm.LoadExecutable(exe);
// Not sure what to do next ???
```
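From poking around the headers, it looks like the Executable module exposes a PackedFunc that creates a VirtualMachine module, which is then driven through `init` / `set_input` / `invoke` packed functions, GraphRuntime-style. A sketch of my best guess — I'm not at all sure these names and signatures are right, and header locations seem to move between TVM versions:

```cpp
#include <fstream>
#include <sstream>

#include <tvm/runtime/module.h>
#include <tvm/runtime/ndarray.h>
#include <tvm/runtime/packed_func.h>
#include <tvm/runtime/vm/executable.h>  // older TVM: tvm/runtime/vm.h?

int main() {
  // Load the compiled kernels, then the serialized byte-code.
  tvm::runtime::Module lib = tvm::runtime::Module::LoadFromFile("vm_lib.so");
  std::ifstream in("vm_exec.ro", std::ios::binary);
  std::stringstream ss;
  ss << in.rdbuf();
  tvm::runtime::Module exe = tvm::runtime::vm::Executable::Load(ss.str(), lib);

  // Guess: the executable knows how to wrap itself in a VM module.
  tvm::runtime::Module vm = exe.GetFunction("vm_load_executable")();

  // Guess at the init signature: device type, device id, allocator
  // type (the exact argument list seems to vary across versions).
  vm.GetFunction("init")(static_cast<int>(kDLCPU), 0, /*alloc_type=*/2);

  // The whole point: the input can be whatever size this image
  // happens to be, since the shape was relay.Any() at compile time.
  tvm::runtime::NDArray input = tvm::runtime::NDArray::Empty(
      {1, 3, 480, 640}, {kDLFloat, 32, 1}, {kDLCPU, 0});
  vm.GetFunction("set_input")("main", input);

  // For a single-output model the result should convert to an
  // NDArray; multiple outputs presumably come back as an ADT.
  tvm::runtime::NDArray out = vm.GetFunction("invoke")("main");
  (void)out;
  return 0;
}
```

Is this roughly the intended flow, or is there a supported C++ entry point I'm missing?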
Not sure how to proceed