How can I deploy a relay module (conv2d) for C++ use?

sergiomatiz · March 11, 2020, 5:39pm

Hi everyone,

I am trying to call a module (created from relay “conv2d”) in C++. Following the example in:

https://docs.tvm.ai/deploy/cpp_deploy.html

I have exported the library as:

graph, lib, params = relay.build_module.build(mod, params = dict_params, target=target)
lib.export_library("conv2d_cpu_opt.so")

and when I do print(mod) I get the following:

def @main(%data: Tensor[(1, 3, 227, 227), float32], %kernel: Tensor[(96, 3, 11, 11), float32]) -> Tensor[(1, 96, 55, 55), float32] {
  nn.conv2d(%data, %kernel, strides=[4, 4], padding=[0, 0, 0, 0], channels=96, kernel_size=[11, 11], out_dtype="float32") /* ty=Tensor[(1, 96, 55, 55), float32] */
}

I am looking for the function signature, since I need it to pass the right arguments in C++.

From the print(mod) my guess is that the function name is “main”, and that the function signature would be something like my_func(data,kernel,output) or maybe even output = my_func(data,kernel).

However, when I do:

f = mod.GetFunction("main");
CHECK(f != nullptr);

in the C++ program, I get

terminate called after throwing an instance of 'dmlc::Error'
  what():  [11:38:05] conv2d_deploy_cpu.cc:43: Check failed: f != nullptr:

That is, mod.GetFunction is not able to retrieve any function with name “main”.

Does anybody know what I am missing and how I can get the signature and function name from a module built out of a relay function?

I really appreciate any help you can provide on this issue. I am tagging @masahi in this post since this seems to be related to

jmorrill · March 11, 2020, 8:51pm

I’m no expert, but I thought the module function names were run, get_input, get_output, etc., not “main”.

For this, I use the GraphRuntime C++ API, which is quite easy to use once you get the hang of setting up the inputs and getting the outputs.

https://github.com/apache/incubator-tvm/blob/e5044cb9c2d247506bc5aa0e04ce65ea31077179/src/runtime/graph/graph_runtime.h

sergiomatiz · March 12, 2020, 9:09pm

Hi @jmorrill,

Thanks a lot for your reply. My guess is that I may have to use the tvm::runtime environment from C++, similar to what you can do in Python with:

module = runtime.create(graph, lib, ctx)
module.set_input(input_name, data_tvm)
module.set_input(**params)

I was wondering if you know of some examples related to this issue that can be found on the web. So far I have not seen much on how to import the graph and parameters from C++. Any help would be greatly appreciated.

jmorrill · March 13, 2020, 3:11am

I'm not sure about examples, but I do something like this to load the model (hastily cherry-picked and abridged from one of my projects):

mod_syslib = Module::LoadFromFile(library_path);
std::ifstream json_in(json_path);
if(json_in.fail())
{
    throw std::runtime_error("could not open json file");
}

const std::string json_data((std::istreambuf_iterator<char>(json_in)), std::istreambuf_iterator<char>());
json_in.close();
const std::vector<TVMContext> ctxs = { context };
graph.Init(json_data, mod_syslib, ctxs);
graph.LoadParams(params_file_loaded_into_string_or_a_dmlc_stream);
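The snippet above needs the graph JSON (and possibly the params blob) in memory as a string. A minimal sketch of one way to do that with only the standard library follows. `ReadFileToString` is a hypothetical helper name, not a TVM function, and it assumes the JSON and params were already written to files on the Python side.

```cpp
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of TVM): read a whole file into a std::string.
// The file is opened in binary mode so a params blob is read byte for byte,
// without newline translation on Windows.
std::string ReadFileToString(const std::string& path) {
    std::ifstream in(path, std::ios::in | std::ios::binary);
    if (!in) {
        throw std::runtime_error("could not open file: " + path);
    }
    return std::string((std::istreambuf_iterator<char>(in)),
                       std::istreambuf_iterator<char>());
}
```

The returned string could stand in for `json_data`, or for the params placeholder if the string form of LoadParams is used.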

Then you should have a CPU-based NDArray created from NDArray::Empty, or a plain old buffer, that you will use for the input data. Run() the graph, then grab the output. I threw this together from memory, but it should get you started.

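// Copy host data into the graph's input tensor.
auto device_input_array = graph.GetInput(index);
device_input_array.CopyFromBytes(my_input_data_ptr, input_size);  // size in bytes
graph.Run();
// Copy the result back into a host buffer.
auto device_output_array = graph.GetOutput(output_index);
device_output_array.CopyToBytes(my_output_ptr, output_size);  // size in bytes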
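The byte counts passed to CopyFromBytes and CopyToBytes have to match the tensor shapes printed by print(mod) earlier in the thread. The sketch below shows how those sizes could be derived for this conv2d. `ConvOutDim` and `NumBytes` are hypothetical helpers, not TVM APIs, and the formula assumes a dilation of 1.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: output extent of a convolution along one spatial axis,
// assuming dilation 1. Integer division gives the floor for valid inputs.
std::int64_t ConvOutDim(std::int64_t in, std::int64_t kernel, std::int64_t stride,
                        std::int64_t pad_before, std::int64_t pad_after) {
    return (in + pad_before + pad_after - kernel) / stride + 1;
}

// Hypothetical helper: number of bytes in a dense tensor of the given shape.
std::size_t NumBytes(const std::vector<std::int64_t>& shape, std::size_t elem_bytes) {
    std::size_t count = 1;
    for (std::int64_t dim : shape) {
        count *= static_cast<std::size_t>(dim);
    }
    return count * elem_bytes;
}
```

For the module in this thread, `ConvOutDim(227, 11, 4, 0, 0)` gives 55, which matches the (1, 96, 55, 55) output. With 4-byte float32 elements, the (1, 3, 227, 227) input is 618348 bytes and the output is 1161600 bytes.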

jmorrill · March 13, 2020, 3:15am

Also, a trick: because ifstreams are so slow with MSVC in debug builds, I load the large params file through a memory-mapped file.

I use a header-only lib (mio) for the memory mapping and load the params like this:

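std::error_code error;
// Map the whole params file into memory instead of reading it through an ifstream.
mio::mmap_sink rw_mmap = mio::make_mmap_sink(descriptor.params_path, 0, mio::map_entire_file, error);
if(error)
{
    throw std::runtime_error(error.message());
}

// Wrap the mapped bytes in a dmlc stream.
dmlc::MemoryFixedSizeStream stream(rw_mmap.begin(), rw_mmap.size());

graph->Init(json_data, mod_syslib, ctxs);
graph->LoadParams(&stream);  // load the params from the memory-mapped stream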

sergiomatiz · April 7, 2020, 12:55pm

Thank you very much for your help. I am gonna give it a try using the GraphRuntime C++ API

jinfagang · April 6, 2022, 7:17am

What is graph here? (I mean, where is it created?)