How can I deploy a relay module (conv2d) for C++ use?

Hi everyone,

I am trying to call a module (created from relay “conv2d”) in C++. Following the example in:

I have exported the library as:

graph, lib, params = relay.build(mod, target=target, params=dict_params)

and when I do print(mod) I get the following:

def @main(%data: Tensor[(1, 3, 227, 227), float32], %kernel: Tensor[(96, 3, 11, 11), float32]) -> Tensor[(1, 96, 55, 55), float32] {
  nn.conv2d(%data, %kernel, strides=[4, 4], padding=[0, 0, 0, 0], channels=96, kernel_size=[11, 11], out_dtype="float32") /* ty=Tensor[(1, 96, 55, 55), float32] */
}

I am looking for the function signature, since I need it to pass the right arguments in C++.

From the print(mod) my guess is that the function name is “main”, and that the function signature would be something like my_func(data,kernel,output) or maybe even output = my_func(data,kernel).

However, when I do:

f = mod.GetFunction("main");
CHECK(f != nullptr);

in the C++ program, I get

terminate called after throwing an instance of 'dmlc::Error'
  what():  [11:38:05] Check failed: f != nullptr:

That is, mod.GetFunction is not able to retrieve any function with name “main”.

Does anybody know what I am missing and how I can get the signature and function name from a module built out of a relay function?

I really appreciate any help you can provide on this issue. I am tagging @masahi in this post since this seems to be related to

I’m not the expert, but I thought the module function names were run, get_input, get_output, etc., not “main”.

For this, I use the GraphRuntime C++ API, which is quite easy to use once you get the hang of setting up the inputs and getting the outputs.
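To make this concrete, here is a hedged, untested sketch (modeled on TVM's apps/howto_deploy example; the file names and the registry call are assumptions taken from that example, not from your code) of why GetFunction("main") returns nullptr on the library module: the exported library only holds the compiled kernels, and the callable entry points set_input / run / get_output live on the graph runtime module you create from it:

```cpp
#include <dlpack/dlpack.h>
#include <tvm/runtime/module.h>
#include <tvm/runtime/packed_func.h>
#include <tvm/runtime/registry.h>

#include <fstream>
#include <iterator>
#include <string>

int main() {
  // The exported library holds the compiled operators, not a "main".
  tvm::runtime::Module mod_syslib =
      tvm::runtime::Module::LoadFromFile("deploy_lib.so");

  // Graph JSON produced by relay.build on the Python side.
  std::ifstream json_in("deploy_graph.json");
  std::string json_data((std::istreambuf_iterator<char>(json_in)),
                        std::istreambuf_iterator<char>());

  // Create the graph runtime module; this is what exposes the
  // set_input / run / get_output packed functions.
  int device_type = kDLCPU;
  int device_id = 0;
  tvm::runtime::Module gmod =
      (*tvm::runtime::Registry::Get("tvm.graph_runtime.create"))(
          json_data, mod_syslib, device_type, device_id);

  tvm::runtime::PackedFunc set_input = gmod.GetFunction("set_input");
  tvm::runtime::PackedFunc run = gmod.GetFunction("run");
  tvm::runtime::PackedFunc get_output = gmod.GetFunction("get_output");

  // Typical call sequence (input names match the relay signature):
  // set_input("data", input); set_input("kernel", kernel);
  // run();
  // tvm::runtime::NDArray out = get_output(0);
  return 0;
}
```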


Hi @jmorrill,

Thanks a lot for your reply. My guess is that I may have to use the tvm::runtime environment from C++, similar to what you can do in Python when doing:

module = runtime.create(graph, lib, ctx)
module.set_input(input_name, data_tvm)

I was wondering if you know of some examples related to this issue that can be found on the web. So far I have not seen much on how to import the graph and parameters from C++. Any help would be greatly appreciated.

Not sure about examples, but I do something like this to load (hastily cherry-picked and abridged from one of my projects):

mod_syslib = Module::LoadFromFile(library_path);

std::ifstream json_in(json_path);
if (!json_in)
    throw std::runtime_error("could not open json file");

const std::string json_data((std::istreambuf_iterator<char>(json_in)), std::istreambuf_iterator<char>());
const std::vector<TVMContext> ctxs = { context };
graph.Init(json_data, mod_syslib, ctxs);

Then you should have a CPU-based NDArray created from NDArray::Empty, OR a plain old buffer that you will use for input data. Run() the graph, then grab the output. I threw this together from my head, but it should get you started.

auto device_input_array = graph.GetInput(index);
device_input_array.CopyFromBytes(my_input_data_ptr, input_size);
auto device_output_array = graph.GetOutput(output_index);
device_output_array.CopyToBytes(my_output_ptr, output_size);

Also, a trick: because ifstreams are mega-slow with MSVC in debug builds, I load the large params file with a memory-mapped file.

I use this header-only lib for the mem map and load the params like this:

std::error_code error;

mio::mmap_sink rw_mmap = mio::make_mmap_sink(descriptor.params_path, 0, mio::map_entire_file, error);
if (error)
    throw std::runtime_error(error.message());

dmlc::MemoryFixedSizeStream stream(rw_mmap.begin(), rw_mmap.size());

graph->Init(json_data, mod_syslib, ctxs);

Thank you very much for your help. I am gonna give it a try using the GraphRuntime C++ API 🙂

What is graph here? (I mean, where is it created?)