I’m pretty ignorant on a lot of topics here, so pre-apologies.
What I want to do:
From Python, compile an MXNet model that has a dynamic input size (to handle differently sized images), and save the byte-code.
Then, in C++, load that byte-code into a Relay VM and run inference, taking dynamically sized inputs at runtime.
I am familiar with the C++ GraphRuntime and I’ve poked around vm.h, but before I get too far down a dead end, I wanted to ask: is this currently possible? If so, are there any examples to go off of?
It would seem I would start in Python with something like:

```python
exe = vm.compile(mod, target, etc)
bytecode, lib = exe.save()
# write bytecode to file...
# ...then...
lib.export_library(filename)
```
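To flesh that out: my understanding is that the dynamic dimensions are declared with `relay.Any()` in the shape dict when importing the model, before compiling with `relay.vm.compile`. A sketch of what I have in mind (the model choice, file names, and input name `"data"` are just placeholders, and I haven't verified this end to end):

```python
import mxnet as mx
import tvm
from tvm import relay

# Height and width are unknown at compile time, so mark them
# with relay.Any(); batch and channel stay fixed here.
shape_dict = {"data": (1, 3, relay.Any(), relay.Any())}

# Placeholder model -- any Gluon block should work the same way.
block = mx.gluon.model_zoo.vision.resnet18_v1(pretrained=True)
mod, params = relay.frontend.from_mxnet(block, shape_dict)

with tvm.transform.PassContext(opt_level=3):
    exe = relay.vm.compile(mod, target="llvm", params=params)

# save() splits the executable into serialized byte-code and a
# runtime module holding the compiled kernels.
bytecode, lib = exe.save()
with open("vm_exec.ro", "wb") as f:  # file name is arbitrary
    f.write(bytecode)
lib.export_library("vm_lib.so")
```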
Then from C++, something like:

```cpp
auto mod = tvm::runtime::Module::LoadFromFile(lib_path);
auto exe = tvm::runtime::vm::Executable::Load(byte_code_path, mod);
auto vm = tvm::runtime::vm::VirtualMachine();
vm.LoadExecutable(exe);
// Not sure what to do next ???
```
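From poking around the headers, it looks like the Executable module exposes a PackedFunc that creates a VirtualMachine module, which is then driven through `init` / `set_input` / `invoke` packed functions, GraphRuntime-style. A sketch of my best guess — I'm not at all sure these names and signatures are right, and header locations seem to move between TVM versions:

```cpp
#include <fstream>
#include <sstream>

#include <tvm/runtime/module.h>
#include <tvm/runtime/ndarray.h>
#include <tvm/runtime/packed_func.h>
#include <tvm/runtime/vm/executable.h>  // older TVM: tvm/runtime/vm.h?

int main() {
  // Load the compiled kernels, then the serialized byte-code.
  tvm::runtime::Module lib = tvm::runtime::Module::LoadFromFile("vm_lib.so");
  std::ifstream in("vm_exec.ro", std::ios::binary);
  std::stringstream ss;
  ss << in.rdbuf();
  tvm::runtime::Module exe = tvm::runtime::vm::Executable::Load(ss.str(), lib);

  // Guess: the executable knows how to wrap itself in a VM module.
  tvm::runtime::Module vm = exe.GetFunction("vm_load_executable")();

  // Guess at the init signature: device type, device id, allocator
  // type (the exact argument list seems to vary across versions).
  vm.GetFunction("init")(static_cast<int>(kDLCPU), 0, /*alloc_type=*/2);

  // The whole point: the input can be whatever size this image
  // happens to be, since the shape was relay.Any() at compile time.
  tvm::runtime::NDArray input = tvm::runtime::NDArray::Empty(
      {1, 3, 480, 640}, {kDLFloat, 32, 1}, {kDLCPU, 0});
  vm.GetFunction("set_input")("main", input);

  // For a single-output model the result should convert to an
  // NDArray; multiple outputs presumably come back as an ADT.
  tvm::runtime::NDArray out = vm.GetFunction("invoke")("main");
  (void)out;
  return 0;
}
```

Is this roughly the intended flow, or is there a supported C++ entry point I'm missing?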
Not sure how to proceed