TVM deploy a C++ example

Hello everybody,

I have an ONNX model (a visualization is attached here). To work with this model I need C++ code for it, so I used TVM to generate C code (you can see the generated code here) and compiled that C code as a shared library. Now I want to write a main function that loads the shared library, creates some random data, converts it to TVM arrays using TVMArrayAlloc, calls TVM functions from the shared library using GetFunction, and finally reads the output. But since I'm totally new to TVM, I am confused and don't know which functions should be called from the shared library. Do you have any idea?

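#include <tvm/runtime/c_runtime_api.h>  // TVMArrayAlloc, TVMArrayCopyFromBytes, TVMArrayFree
#include <tvm/runtime/module.h>
#include <tvm/runtime/registry.h>
#include <tvm/runtime/packed_func.h>

#include <opencv2/core.hpp>  // only needed for the cv::Mat input below

#include <fstream>
#include <iterator>
#include <algorithm>
#include <string>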

int main()
{
    tvm::runtime::Module mod_syslib = tvm::runtime::Module::LoadFromFile("model.so");

    std::ifstream json_in("model.json", std::ios::in);
    std::string json_data((std::istreambuf_iterator<char>(json_in)), std::istreambuf_iterator<char>());
    json_in.close();

    std::ifstream params_in("model.params", std::ios::binary);
    std::string params_data((std::istreambuf_iterator<char>(params_in)), std::istreambuf_iterator<char>());
    params_in.close();

    TVMByteArray params_arr;
    params_arr.data = params_data.c_str();
    params_arr.size = params_data.length();

    int dtype_code = kDLFloat;
    int dtype_bits = 32;
    int dtype_lanes = 1;
    int device_type = kDLCPU;
    int device_id = 0;

    tvm::runtime::Module mod = (*tvm::runtime::Registry::Get("tvm.graph_runtime.create"))(json_data, mod_syslib, device_type, device_id);

    DLTensor* x;
    int in_ndim = 4;
    int64_t in_shape[4] = {1, 100, 128, 3};
    TVMArrayAlloc(in_shape, in_ndim, dtype_code, dtype_bits, dtype_lanes, device_type, device_id, &x);
    
    int input_byte_size = 1*100*128*3*sizeof(float);
    cv::Mat input_data(100, 128, CV_32FC3);
    cv::randu(input_data, cv::Scalar::all(0.0), cv::Scalar::all(1.0)); // random input; replace with your own data

    TVMArrayCopyFromBytes(x, input_data.data, input_byte_size);

    tvm::runtime::PackedFunc set_input = mod.GetFunction("set_input");
    set_input("data", x);

    tvm::runtime::PackedFunc load_params = mod.GetFunction("load_params");
    load_params(params_arr);

    tvm::runtime::PackedFunc run = mod.GetFunction("run");
    run();

    DLTensor* y;
    int out_ndim = 3;
    int64_t out_shape[3] = {1, 100, 6};
    TVMArrayAlloc(out_shape, out_ndim, dtype_code, dtype_bits, dtype_lanes, device_type, device_id, &y);

    tvm::runtime::PackedFunc get_output = mod.GetFunction("get_output");
    get_output(0, y); // afterwards, TVMArrayCopyToBytes(y, dst, nbytes) copies the output into any array, vector, or Mat you want

    TVMArrayFree(x);
    TVMArrayFree(y);

    return 0;
}

I hope this code helps. (I haven't actually run it, so it may contain errors, but the flow should be something like this.)
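
One thing the flow above does not do is check that model.json and model.params were actually opened, and it writes out the input byte size by hand. The sketch below shows one way to cover both using only the standard library. ReadFile and TensorBytes are hypothetical helper names introduced here for illustration; they are not part of TVM.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not a TVM API): read a whole file into a string,
// throwing if the file cannot be opened.
std::string ReadFile(const std::string& path, bool binary) {
    std::ifstream in(path, binary ? std::ios::in | std::ios::binary : std::ios::in);
    if (!in) {
        throw std::runtime_error("cannot open " + path);
    }
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Hypothetical helper: number of bytes in a dense tensor with the given
// shape and element size, e.g. {1, 100, 128, 3} with sizeof(float).
std::size_t TensorBytes(const std::vector<int64_t>& shape, std::size_t elem_size) {
    std::size_t bytes = elem_size;
    for (int64_t dim : shape) {
        bytes *= static_cast<std::size_t>(dim);
    }
    return bytes;
}
```

With these, json_data could be filled with ReadFile("model.json", false), params_data with ReadFile("model.params", true), and input_byte_size with TensorBytes({1, 100, 128, 3}, sizeof(float)).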

@juierror Thank you so much for the code.

With this code I got the following error:

terminate called after throwing an instance of 'dmlc::Error'
  what():  [13:30:24] /home/sara/tvm/src/runtime/graph/graph_runtime.cc:181: 
---------------------------------------------------------------
An internal invariant was violated during the execution of TVM.
Please read TVM's error reporting guidelines.
More details can be found here: https://discuss.tvm.ai/t/error-reporting/7793.
---------------------------------------------------------------

  Check failed: data->ndim == data_out->ndim (2 vs. 3) : 
Stack trace:
  [bt] (0) /usr/local/lib/libtvm_runtime.so(+0xd8822) [0x7fb4042c1822]
  [bt] (1) /usr/local/lib/libtvm_runtime.so(tvm::runtime::GraphRuntime::CopyOutputTo(int, DLTensor*)+0x22f) [0x7fb4042c49cf]
  [bt] (2) /usr/local/lib/libtvm_runtime.so(+0xdbf41) [0x7fb4042c4f41]
  [bt] (3) ./lstm(+0x477a) [0x5588e761e77a]
  [bt] (4) /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7) [0x7fb4033f2bf7]
  [bt] (5) ./lstm(+0x4f5a) [0x5588e761ef5a]


Aborted (core dumped)

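The check shows that the model's output has 2 dimensions, while the code allocated a 3-dimensional output tensor. So I changed it to: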

int out_ndim = 2;
int64_t out_shape[2] = {100, 6};

and now it works.
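
The failed check compares the rank of the model's output with the rank of the tensor passed to get_output. A minimal sketch of doing the same comparison up front, assuming you already know the expected output shape of your model, might look like this. ShapeMismatch is a hypothetical helper written for illustration, not a TVM function.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not a TVM API): returns an empty string when the two
// shapes match, otherwise a short description of the first difference.
std::string ShapeMismatch(const std::vector<int64_t>& expected,
                          const std::vector<int64_t>& actual) {
    if (expected.size() != actual.size()) {
        return "ndim mismatch: " + std::to_string(expected.size()) +
               " vs. " + std::to_string(actual.size());
    }
    for (std::size_t i = 0; i < expected.size(); ++i) {
        if (expected[i] != actual[i]) {
            return "dim " + std::to_string(i) + " mismatch: " +
                   std::to_string(expected[i]) + " vs. " +
                   std::to_string(actual[i]);
        }
    }
    return "";
}
```

Here ShapeMismatch({100, 6}, {1, 100, 6}) reports the same "2 vs. 3" rank difference as the error above, and it returns an empty string once the output tensor is allocated as {100, 6}.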

@sosa3104 Good to know! Thank you too!!

Hello @juierror and @sosa3104, it seems that "model.so" contains both the weights and the code. Do you know a way to dump the weights and the code separately, and how to load them on the C++ side?