TVM AOT C+BYOC compilation fails with `expected str but got Object`

Hi, we are working on a custom embedded NPU.

Our goal is to use BYOC with the C runtime and an AOT executor, much like the “Running TVM on bare metal Arm(R) Cortex(R)-M55 CPU and Ethos(TM)-U55 NPU” tutorial. I’ve created this very simple Relay example to get something through the pipeline. We use our own fork of TVM, but I rebased on upstream yesterday and the issue is still the same.

import numpy as np
import tvm
from tvm import relay

tensor_shape = (3, 3, 3)
data_type = "int8"
a = relay.var("a", relay.TensorType(tensor_shape, data_type))
b = relay.var("b", relay.TensorType(tensor_shape, data_type))
total = np.prod(tensor_shape)
constant_tensor = np.arange(0, total, dtype=data_type)
constant_tensor = np.reshape(constant_tensor, tensor_shape)
c = relay.const(constant_tensor, data_type)
sum_expr = relay.add(relay.add(a, b), c)
module = tvm.ir.IRModule.from_expr(sum_expr)

When I compile just relay.add(a, b) everything works fine, but if I add the constant, this error is triggered:

Traceback (most recent call last):
  File "soma_codegen.py", line 81, in <module>
    compile_model(tvmc_model=model,
  File "/home/josse/phd/tvm-fork/python/tvm/driver/tvmc/compiler.py", line 307, in compile_model
    graph_module = relay.build(
  File "/home/josse/phd/tvm-fork/python/tvm/relay/build_module.py", line 468, in build
    graph_json, runtime_mod, params = bld_mod.build(
  File "/home/josse/phd/tvm-fork/python/tvm/relay/build_module.py", line 196, in build
    self._build(mod, target, target_host, executor, runtime, workspace_memory_pools, mod_name)
  File "/home/josse/phd/tvm-fork/python/tvm/_ffi/_ctypes/packed_func.py", line 237, in __call__
    raise get_last_ffi_error()
tvm._ffi.base.TVMError: Traceback (most recent call last):
  7: TVMFuncCall
        at /home/josse/phd/tvm-fork/src/runtime/c_runtime_api.cc:477
  6: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
        at /home/josse/phd/tvm-fork/include/tvm/runtime/packed_func.h:1204
  5: tvm::runtime::PackedFuncObj::Extractor<tvm::runtime::PackedFuncSubObj<tvm::relay::backend::RelayBuildModule::GetFunction(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, tvm::runtime::ObjectPtr<tvm::runtime::Object> const&)::{lambda(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)#3}> >::Call(tvm::runtime::PackedFuncObj const*, tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)
        at /home/josse/phd/tvm-fork/include/tvm/runtime/packed_func.h:1200
  4: tvm::relay::backend::RelayBuildModule::GetFunction(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, tvm::runtime::ObjectPtr<tvm::runtime::Object> const&)::{lambda(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)#3}::operator()(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
        at /home/josse/phd/tvm-fork/src/relay/backend/build_module.cc:193
  3: tvm::relay::backend::RelayBuildModule::Build(tvm::IRModule, tvm::runtime::Map<tvm::Integer, tvm::Target, void, void> const&, tvm::Target const&, tvm::relay::Executor const&, tvm::relay::Runtime const&, tvm::WorkspaceMemoryPools const&, tvm::runtime::String)
        at /home/josse/phd/tvm-fork/src/relay/backend/build_module.cc:312
  2: tvm::relay::backend::RelayBuildModule::BuildRelay(tvm::IRModule, tvm::runtime::String const&)
        at /home/josse/phd/tvm-fork/src/relay/backend/build_module.cc:483
  1: tvm::codegen::CreateMetadataModule(std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, tvm::runtime::NDArray, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, tvm::runtime::NDArray> > > const&, tvm::runtime::Module, tvm::runtime::Array<tvm::runtime::Module, void> const&, tvm::Target, tvm::relay::Runtime, tvm::relay::backend::ExecutorCodegenMetadata)
        at /home/josse/phd/tvm-fork/src/target/metadata_module.cc:95
  0: tvm::runtime::TVMRetValue::operator std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >() const
        at /home/josse/phd/tvm-fork/include/tvm/runtime/packed_func.h:823
  File "/home/josse/phd/tvm-fork/include/tvm/runtime/packed_func.h", line 823
TVMError: 
---------------------------------------------------------------
An error occurred during the execution of TVM.
For more information, please see: https://tvm.apache.org/docs/errors.html
---------------------------------------------------------------

  Check failed: type_code_ == kTVMStr (8 vs. 11) : expected str but got Object

I’ve tried going through it with a debugger, using the script provided here: https://github.com/Lunderberg/tvm-gdb-extension . I can set breakpoints and everything, but the debugger output is really difficult to work with (that might be because I’m inexperienced with GDB), as is also mentioned in this post: Debugging libtvm.so

I also noticed that some of the files involved in the error don’t have any documentation yet on the TVM docs site (tvm::codegen Namespace Reference). Is this supposed to be this way?

Any help with fixing this issue or with debugging is greatly appreciated; currently I’m kind of shotgun-debugging this, which is not a nice way to work.

Thank you very much!

Kindly pinging @areusch @Mousius @manupa-arm

hi @JosseVanDelm could you kindly share the revision of tvm you’re using? I may have broken something recently in https://github.com/apache/tvm/pull/10283.

How I would debug this: the error indicates there is a PackedFunc which metadata_module.cc expects should return a String, but instead it is returning Object. Consult metadata_module.cc on line 95 to see which function is returning the wrong data type.
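To make the failure mode concrete, here is a small Python sketch of the tagged-return-value check that produces this error. It is illustrative only, not TVM's real code; the type-code numbers match the `(8 vs. 11)` in the traceback, where 8 is the generic Object handle code and 11 is the plain-string code.

```python
# Toy model of TVM's TVMRetValue type-code check (sketch, not real TVM code).
kTVMObjectHandle = 8   # a generic Object (e.g. a runtime::String stored as an Object)
kTVMStr = 11           # a raw C string

class RetValue:
    def __init__(self, type_code, value):
        self.type_code = type_code
        self.value = value

    def as_str(self):
        # Mirrors the CHECK(type_code_ == kTVMStr) in packed_func.h: if a
        # PackedFunc stored an Object instead of a plain string, this is
        # where "expected str but got Object" fires.
        if self.type_code != kTVMStr:
            raise TypeError("expected str but got code %d" % self.type_code)
        return self.value

rv = RetValue(kTVMObjectHandle, "tvmgen_default_main_0")
try:
    rv.as_str()
except TypeError as e:
    print(e)  # the PackedFunc returned an Object where a str was expected
```

The point is that the failure is not in the value itself but in the type tag it was stored under, which is why the fix is in whichever PackedFunc filled in the return value.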


Hi @areusch,

Thanks for coming back to me so quickly! I was aware of the PR, and I hoped it changed the behaviour, but at first glance it did not seem to improve or exacerbate the issue.

Our current codebase (the initial one that failed) was rebased on top of commit 0009a308d82a321f2399923ab14b4c088461b4f2 (before PR 10283)

I also rebased last week on commit 1cf0c0a5bfa6b0c61ce253142b66f6235d694e07 (after PR 10283)

But I did not observe any change, so I’m afraid the error is on our side. I’ll try debugging myself with the pointers you provided and give you more details as I go.

Best regards,

Josse

Hi @areusch,

The issue seems to stem from here?

        non_exportable_modules += pf_sym().operator std::string();

What happens on this line? I cannot tell what is happening here. Also, I cannot call this line from the debugger:

      auto pf_sym = mod.GetFunction("get_symbol");
(gdb) call mod.GetFunction("get_symbol")
Too few arguments in function call.

Which seems to be correct if you look at source_module.cc?

Is this possible?

Anyway, I’ve tried to debug the PackedFunc, but honestly it’s still very easy to get lost in all of these things, and it’s really difficult for me to find out what’s going on. Everything is hidden behind tvm::runtime::Object or tvm::runtime::ObjectRef. I don’t even know if the log below is the call that is causing the issue, but it shows what I’m currently looking at. I can’t make any sense of this…

(gdb) print args[0]
$24 = {<tvm::runtime::TVMPODValue_> = {value_ = {v_int64 = 93825008480720, v_float64 = 4.6355
713411087361e-310, v_handle = 0x5555564d31d0, v_str = 0x5555564d31d0 "\215", v_type = {code =
 208 '\320', bits = 49 '1', lanes = 22093}, v_device = {device_type = 1447899600, device_id =
 21845}}, type_code_ = 8}, <No data fields>}
(gdb) print args[1]
$25 = {<tvm::runtime::TVMPODValue_> = {value_ = {v_int64 = 93825011529344, v_float64 = 4.6355
714917307746e-310, v_handle = 0x5555567bb680, v_str = 0x5555567bb680 "\005", v_type = {code =
 128 '\200', bits = 182 '\266', lanes = 22139}, v_device = {device_type = 1450948224, device_
id = 21845}}, type_code_ = 8}, <No data fields>}
(gdb) print args[2]
$26 = {<tvm::runtime::TVMPODValue_> = {value_ = {v_int64 = 0, v_float64 = 0, v_handle = 0x0,
v_str = 0x0, v_type = {code = 0 '\000', bits = 0 '\000', lanes = 0}, v_device = {device_type
= 0, device_id = 0}}, type_code_ = 4}, <No data fields>}
(gdb) print args[3]
$27 = {<tvm::runtime::TVMPODValue_> = {value_ = {v_int64 = 93825009138880, v_float64 = 4.6355
713736261606e-310, v_handle = 0x555556573cc0, v_str = 0x555556573cc0 "\300\001", v_type = {co
de = 192 '\300', bits = 60 '<', lanes = 22103}, v_device = {device_type = 1448557760, device_
id = 21845}}, type_code_ = 8}, <No data fields>}
(gdb) print args[4]
$28 = {<tvm::runtime::TVMPODValue_> = {value_ = {v_int64 = 93825010182912, v_float64 = 4.6355
714252081951e-310, v_handle = 0x555556672b00, v_str = 0x555556672b00 "\304\001", v_type = {co
de = 0 '\000', bits = 43 '+', lanes = 22119}, v_device = {device_type = 1449601792, device_id
 = 21845}}, type_code_ = 8}, <No data fields>}
(gdb) print args[5]
$29 = {<tvm::runtime::TVMPODValue_> = {value_ = {v_int64 = 0, v_float64 = 0, v_handle = 0x0,
v_str = 0x0, v_type = {code = 0 '\000', bits = 0 '\000', lanes = 0}, v_device = {device_type
= 0, device_id = 0}}, type_code_ = 4}, <No data fields>}
(gdb) print args[6]
$30 = {<tvm::runtime::TVMPODValue_> = {value_ = {v_int64 = 140736746134272, v_float64 = 6.953
3191372424654e-310, v_handle = 0x7fffd3c29b00, v_str = 0x7fffd3c29b00 "tvmgen_default", v_typ
e = {code = 0 '\000', bits = 155 '\233', lanes = 54210}, v_device = {device_type = 3552746240
, device_id = 32767}}, type_code_ = 11}, <No data fields>}
(gdb)

Could you maybe elaborate on how I can find the function which is returning the wrong datatype in the function call?

Thanks!

Thanks for the additional information @JosseVanDelm ! It looks like one of the runtime::Modules being exported in your compilation flow doesn’t implement get_symbol correctly. You might take a look at some GDB extensions developed by the community to help you inspect ObjectRefs a bit more.

Here’s how i would debug this, though:

  • auto pf_sym = mod.GetFunction("get_symbol"); is looking up a PackedFunc inside the runtime::Module named mod. In effect this invokes the C++ member function PackedFunc GetFunction(std::string) on that Module.
  • GetFunction is expected to return a PackedFunc which is placed in pf_sym.
  • pf_sym() invokes the PackedFunc, passing 0 arguments (because PackedFunc args are always placed inside a C++ TVMArgs args array, the PackedFunc call will always have the same C++ signature unless implemented using TypedPackedFunc sugar).
  • To properly debug this, we need to find out which subclass of runtime::Module is in use here. The easiest way to do that is by inspecting ModuleNode::type_key(), which you should be able to do either with logging or from GDB: p mod->type_key().
  • From there, locate the GetFunction implementation for that Module. That function should contain a large if-else block:
    if (name == "get_const_var") {
      return PackedFunc([](TVMArgs args, TVMRetValue* rv) { ... });
    } else if (name == "get_...") { 
      return PackedFunc([](TVMArgs args, TVMRetValue* rv) { ... });
    } ...
    
  • The branch which implements get_symbol probably is returning a PackedFunc which itself doesn’t return a string. Fix that and I think it should unbreak you.
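The name-based dispatch described above can be mimicked in Python. This is a hypothetical mini-module, not TVM's API; only the names `get_symbol` and `get_const_vars` are taken from the thread, everything else is illustrative.

```python
# Sketch of a Module whose GetFunction dispatches on the requested name,
# mirroring the big if/else block in a ModuleNode subclass (not TVM code).
class MiniCSourceModule:
    def __init__(self, func_names, const_vars):
        self.func_names = func_names
        self.const_vars = const_vars

    def get_function(self, name):
        # Each branch returns a closure, like `return PackedFunc(...)` in C++.
        if name == "get_symbol":
            # Must return a plain string; returning any other kind of object
            # here is exactly what trips "expected str but got Object" later.
            return lambda: str(self.func_names[0])
        elif name == "get_const_vars":
            return lambda: list(self.const_vars)
        return None  # unknown name -> no PackedFunc

mod = MiniCSourceModule(["tvmgen_default_mybyoc_main_0"], ["const_0"])
pf_sym = mod.get_function("get_symbol")
print(pf_sym())
```

In the real C++ code the branch bodies are lambdas wrapped in PackedFunc, but the lookup-then-call structure is the same.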

Hi Andrew,

Thank you so much for coming back to me and giving me the detailed explanation. I’ve installed the extra GDB commands you referenced, however, I’m still struggling to find out what’s going on here.

The module I’m debugging is a tvm::runtime::CSourceModuleNode inside of src/target/source_module.cc

I can step through the code, and I can see that it does indeed get into the giant if-else block. There, a PackedFunc is indeed constructed from a lambda function. However, I don’t understand how I can find out which PackedFunc is being constructed, or how it is constructed. It seems I’m struggling to grasp what the TVM runtime is doing or how a PackedFunc actually works.

The way I see it, a PackedFunc is a C-style function to which you pass an array of arguments of type TVMArgs and a TVMRetValue. Is there a way I can inspect those arguments once I have pf_sym in the debugger? We are generating C code here, so I think the CSourceModules that were created in some previous Relay compile (?) step need to be converted into actual C code by relay.build?

I don’t get what the PackedFunc at this point in the compilation process is supposed to do. As I understand it, it should just output C code into some string stream, no? Also, aren’t PackedFuncs solely for FFI calls? Do I need to step into some Python code here, then? How can I find out which Python function to step into? I would think the BYOC code handles the codegen process fine without any FFI calls. Also, should I implement my own GetFunction here? That isn’t stated in the BYOC C codegen tutorial (only for the JSON runtime?), right?

Sorry to raise all these questions, I’m clearly a bit clueless on what to do next, or what documentation to read…

Best regards and thanks!

hi @jossevandelm,

no problem–this part of the compiler is a bit confusing. Though the thing you are debugging is a runtime::Module, CSourceModuleNode is a compiler-only data structure that holds C source code which needs to be compiled before it can run. See “Introduce Artifact, a container for generated code” for a discussion of why there are runtime::Modules here.

However, I don’t understand how I can find which PackedFunc is being constructed or in which way this is constructed.

This can be confusing, and it’s pretty hard to identify the PackedFunc unless you either:

  1. determine the arguments to the GetFunction() call that produced it
  2. step into the PackedFunc and see where it jumps to

1 is probably easiest.

I don’t get what the packedfunc at this point in the compilation progress is supposed to do? As I get it, it should just output C code in some string stream? No?

In this case, get_symbol is not a function invoked at runtime–instead it’s invoked during compilation (as you see) to feed data into the ConstLoaderModule. The return value of get_symbol should mostly be null, unless, on initialization, a runtime::Module needs to deserialize some NDArray. If so, get_symbol can return the name of a PackedFunc in that runtime::Module which should be called (prefixed with __init_) with the values returned from get_const_vars (those values extracted at compile time). This use is a bit of an overloading of the runtime::Module interface.
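The compile-time handshake described above can be sketched as follows. This is a hedged illustration: the `get_symbol`, `get_const_vars`, and `__init_` names come from the post, while the module and helper here are hypothetical stand-ins, not TVM's actual ConstLoaderModule.

```python
# Sketch of the handshake between a loader and a runtime::Module, as
# described above (illustrative only, not TVM's code).
class FakeModule:
    def __init__(self):
        self.initialized_with = None

    def get_function(self, name):
        if name == "get_symbol":
            return lambda: "my_ext_mod"       # symbol naming this module
        if name == "get_const_vars":
            return lambda: ["w0", "w1"]       # constant names recorded at compile time
        if name == "__init_my_ext_mod":
            def init(consts):
                self.initialized_with = consts  # module "deserializes" its arrays
                return 0
            return init
        return None

def const_loader_init(mod, const_store):
    # The loader asks the module for its symbol, then calls
    # "__init_" + symbol with the constants named by get_const_vars.
    sym = mod.get_function("get_symbol")()
    names = mod.get_function("get_const_vars")()
    mod.get_function("__init_" + sym)([const_store[n] for n in names])

store = {"w0": [1, 2], "w1": [3, 4]}
m = FakeModule()
const_loader_init(m, store)
print(m.initialized_with)
```

This also shows why `get_symbol` must return a plain string: its result is concatenated into the `__init_`-prefixed function name to look up.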

In practice this all applies when using code with the TVM C++ runtime, and we don’t currently export any runtime::Module to the C runtime which is expected to use this interface (although CSourceModuleNode implements it, I don’t believe it’s actually used by the `c` backend).

Correct. There is only one load path for CSourceModule from the C++ runtime’s POV (export, compile, load_library). This makes it a bit confusing to understand how the compiler generates these, since runtime::Module looks like it’s meant for the runtime when it’s really not (immediately) meant for it.

You only need to implement GetFunction if you are creating a new subclass of ModuleNode. So, you shouldn’t for CSourceModuleNode. Based on reading the code, it seems like func_names[0] is actually a different ObjectRef than String there–so perhaps see if you can figure out why that is.

hope this helps! Andrew

Hi @areusch,

Thank you so much for taking the time to explain this to me, though I’m still struggling to get it to work…

Ok, I see, thanks for this background information!

I seem unable to use either method for debugging :frowning:

  1. at some point the code ends up in src/target/source/source_module.cc where it calls this constructor of the PackedFunc, IIUC:
      return PackedFunc(
          [sptr_to_self, this](TVMArgs args, TVMRetValue* rv) { *rv = this->func_names_[0]; });

However, I cannot understand what this means. If I understand it correctly, this creates a PackedFunc which just returns an Object wrapping a string that contains tvmgen_default_tvmgen_default_mybyoccodegen_main_0, if I access it like this in gdb:

(gdb) p *((StringObj*)(func_names_[0].data_))

If this is the case, why would you need to create a PackedFunc for this? Would this be the string that is actually returned as an object? If so, what would be the idiomatic way to convert the StringObj to a std::string?

I’ve also tried many times to call the pf_sym() function in the debugger to inspect the output, but I still cannot find a way to call these functions. I’ve even written my own code based on the MyAdd example in the TVM runtime documentation, to see if I can somehow trace what the created PackedFunc is doing. Do you have any tips or pointers?

Thanks!

It could be a bit confusing, as this uses a C++11-style closure; you may be seeing your debugger jump into the closure there. When the PackedFunc is returned, GetFunction is manufacturing a callable which has captured the this pointer, so that when you later call the returned function, you’re invoking a “member PackedFunc” of the runtime::Module. This mechanism is more convoluted than might seem necessary, but the indirection is exploited in CUDAModule to signal when a function should be passed to the CUDA compiler and loaded into GPU memory in preparation for being called.
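The capture pattern can be mimicked in Python, where the returned callable keeps the module alive and reads its members only when invoked. This is a sketch of the idea, not TVM code; the names are illustrative.

```python
# Sketch: get_function manufactures a callable that captures the module
# instance, like the C++ lambda capturing [sptr_to_self, this].
class Module:
    def __init__(self, func_names):
        self.func_names = func_names

    def get_function(self, name):
        if name == "get_symbol":
            # The closure holds a reference to `self`, so the module stays
            # alive as long as the returned function does, and the body
            # reads member state only at call time.
            return lambda: self.func_names[0]
        return None

pf_sym = Module(["tvmgen_default_main_0"]).get_function("get_symbol")
# The Module object is no longer reachable by name, but the closure keeps it alive.
print(pf_sym())
```

Stepping into `pf_sym()` in a debugger therefore lands inside the closure body, which is why the jump can look surprising.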

Correct–invoking the PackedFunc returns a String. String is a wrapper class which is needed to properly pass C++ strings to PackedFunc.

This is done because the C++ interface of runtime::Module specifies only a virtual PackedFunc GetFunction(std::string name) = 0, and all other interactions with a runtime::Module happen via PackedFunc calls, which allows runtime::Module to be implemented in other languages, e.g. Python, if necessary.

Use the String class instead, which is a glorified pointer to StringObj. It should define operator std::string, so simply:

String s = pf();
std::string my_string = s;

should work fine.

I’m not sure it will be especially easy to call a PackedFunc from the debugger, because the source relies on C++ templates to make the call easier to write. I’d suggest modifying the C++ source directly and using LOG(INFO) to print the results instead of trying to call it with the debugger.

Hi @areusch,

Thank you so much for coming back to me in such great detail. I’ve learned a lot by diving into the codebase, but I feel like right now is not the time for me to continue looking into this in this way, especially since there seems to be a lot of (exciting!) upstream work on this side, and since we still have a lot of work on our library’s side.

I believe I found a fix after looking into the Arm CMSIS-NN code, but I’m not sure what the (future) implications are, so I was wondering if someone could tell me. We basically replaced these lines (source: DNNL BYOC backend):

const auto* pf = runtime::Registry::Get("runtime.DNNLJSONRuntimeCreate");
ICHECK(pf != nullptr) << "Cannot find JSON runtime module to create";
auto mod = (*pf)(func_name, graph_json, params);
return mod;

With this line (source ARM CMSIS-NN BYOC backend)

return codegen::CSourceModuleCreate(code, "c", function_names);

It seems to me that the CMSIS-NN backend has its own way to deal with constants and that no registering of any runtime module happens. I think we can deal with the constants ourselves as well for now, but the runtime implications are not clear to me. The generated C code seems (mostly) correct, and I don’t think we need to export these modules with LLVM (and certainly not CUDA) anytime soon for our use case.

Would love to get some input from you @areusch or from the creators of the CMSIS-NN backend on this design decision and the implications for our BYOC backend. Thanks!

@JosseVanDelm I think if you did that (assuming you substitute code with graph_json), you’d get the JSON output but be unable to run it properly, since export_library would attempt to compile the JSON using a C compiler. Not sure if that quite answers your question; if you’re using export_model_library_format you wouldn’t see that behavior.

@areusch sorry for the confusion, I meant to reference these lines of the DNNL BYOC C backend , not the JSON backend:

    // Create a CSource module
    const auto* pf = runtime::Registry::Get("runtime.CSourceModuleCreate");
    ICHECK(pf != nullptr) << "Cannot find csource module to create the external runtime module";
    // TODO(@manupa-arm): pass the function names to enable system-lib creation
    return (*pf)(code, "c", Array<String>{sym}, variables);

Where 4 values are passed:

  • The actual C code
  • The fact that it needs to be passed on to a "c" compiler
  • The function names/symbols Array<String>{sym}
  • The constants : variables - Not in ARM CMSIS-NN?

IIUC, const auto* pf = runtime::Registry::Get("runtime.CSourceModuleCreate"); gets a PackedFunc for creating (constructing?) a CSourceModule; is there a reason why this has to happen through a PackedFunc?

Is there a specific reason why the ARM CMSIS-NN has a separate pass for adding the constants?

BTW, I also tried to call with:

return codegen::CSourceModuleCreate(code, "c", syms, variables);

But this also failed with the same error as above :frowning:

Thank you very much! Best regards,

Josse

Ah no worries, this makes more sense.

I think this is mainly convention and a way to avoid #include "../../../../target/source/source_module.h".
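The registry indirection being described can be sketched as a global name-to-function table. This is a hypothetical Python sketch of the pattern, not TVM's `runtime::Registry`; the registered name matches the thread, the rest is illustrative.

```python
# Sketch of the string-keyed registry pattern: callers look up constructors
# by name instead of #include-ing the defining header (illustrative only).
_REGISTRY = {}

def register(name):
    def deco(fn):
        _REGISTRY[name] = fn
        return fn
    return deco

def get(name):
    # Mirrors runtime::Registry::Get: returns None when nothing is registered.
    return _REGISTRY.get(name)

@register("runtime.CSourceModuleCreate")
def csource_module_create(code, fmt, func_names, const_vars=()):
    # Stand-in for the real constructor; just records its arguments.
    return {"code": code, "fmt": fmt,
            "func_names": list(func_names), "const_vars": list(const_vars)}

pf = get("runtime.CSourceModuleCreate")
assert pf is not None, "Cannot find csource module to create the external runtime module"
mod = pf("int main(void) { return 0; }", "c", ["my_sym"])
print(mod["fmt"])
```

The caller only needs the registry's lookup function and the agreed-upon string key, which is why the DNNL code can construct a CSourceModule without including source_module.h.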

These are two different styles of BYOC implementations–DNNL was added earlier than CMSIS-NN, and CMSIS-NN uses a different code-lowering path to take advantage of USMP I believe. DNNL offloads directly from the Relay graph, meaning the compiler doesn’t see any IR that might allow it to e.g. perform memory planning. CMSIS-NN uses the newly-added Relay-to-TIR and TIR-to-Runtime Target hooks. I believe it also modifies constants to adapt them to the format expected by CMSIS-NN.

I don’t think it makes sense in this case to pass variables to CSourceModuleCreate, because it’s not possible to use ConstLoaderModule in the C runtime. In any case, I think that error would be caused by something in variables not being a String.

The Arm CMSIS-NN TIR-to-Runtime hook seems to use the C++ binding. I think that should be fine for you to use (it should be equivalent to passing variables as an empty list).

I’ve been building my own BYOC backend, following the docs on BYOC, i.e., heavily based on the codegen_c example.

It was working okay, but then I started getting the InternalError: Check failed: type_code_ == kTVMStr (8 vs. 11) : expected str but got Object in some cases.

I was able to solve it in some cases by removing (at random) some of my Relay transforms (of which I’ve got a few).

E.g., removing this allowed codegen to work for my single layer float32 model:

            # seq_wug = tvm.transform.Sequential(
            #     [
            #         transform.InferType(),
            #         transform.FoldConstant(),
            #     ]
            # )
            # with tvm.transform.PassContext(opt_level=3):
            #     mod = seq_wug(mod)

However, I’m still having this problem with my single layer int8 quantised model, even though it is following the same code path (I treat qnn.conv2d and nn.conv2d the same, and this is the only op I handle in my BYOC for now).

Are there any more recent tips on debugging this behaviour in TVM for BYOC?

Followed some of the tracing tips above (just logging, rather than gdb), and found a similar path of my issue.

In metadata_module.cc, in some cases auto pf_sym = mod.GetFunction("get_symbol"); was returning a non-null pf_sym, which caused my error to appear.

I think I fixed it by replacing:

return (*pf)(code, "c", Array<String>{func_names_}, const_names_);

In my BYOC Finalize() function with:

return codegen::CSourceModuleCreate(code, "c", Array<String>{func_names_});

This did require adding an include #include "../../../../target/source/codegen_c_host.h"