[BYOC] no such function in module

Hi, I have been struggling quite a bit to get the BYOC example working.

You can follow my progression:

  1. [BYOC] details about BYOC (newbie)
  2. [BYOC] Failed to find the codegen tool for

I am now using the codegen_c example from contrib. I haven’t really touched it; I am just trying to get it to execute at runtime.

I have added operator registrations into python/tvm/relay/op/contrib/ccompiler.py
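For reference, the registrations follow the usual BYOC pattern of attaching a “target.ccompiler” attribute to the ops to be offloaded. A minimal sketch of what I mean (my own reduction, not the full file; multiply is the only op this script needs):

import tvm.ir

# Mark "multiply" as supported by the external "ccompiler" codegen.
# AnnotateTarget later reads this "target.ccompiler" attribute to decide
# which calls to offload.
@tvm.ir.register_op_attr("multiply", "target.ccompiler")
def _multiply_supported(expr):
    return True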

Here is the code I am executing:

import numpy as np

import tvm
from tvm import relay as R
from tvm.contrib import graph_executor


# build IR
dim = (2, 2)
x = R.var('x', shape=dim)
y = R.var('y', shape=dim)
output = R.multiply(x, y)

params = {}

# Set the TVM build target
target = "llvm"

func = R.Function(R.analysis.free_vars(output), output)
func = R.build_module.bind_params_by_name(func, params)
mod = tvm.IRModule()
mod["main"] = func

from tvm.relay.op.contrib.ccompiler import partition_for_ccompiler
pmod = partition_for_ccompiler(mod)
print(f'PMOD\n{pmod}')
lib = R.build(pmod, target, params=params)

# Generate graph executor
dev = tvm.device(target, 0)
m = graph_executor.GraphModule(lib["default"](dev))

dtype = 'float32'
x = np.array([[1, 1],
              [1, 1]], dtype=dtype)
y = np.array([[2, 2],
              [2, 2]], dtype=dtype)

m.set_input('x', tvm.nd.array(x))
m.set_input('y', tvm.nd.array(y))
m.run()
output = m.get_output(0)

print(f'Output:\n{output}')

Honestly, I am not quite sure what I am doing here… I did my best to put the pieces together.

Here is the code for partition_for_ccompiler:

import tvm
from tvm.relay import transform
from tvm.relay.build_module import bind_params_by_name


def partition_for_ccompiler(mod, params=None):
    if params:
        mod["main"] = bind_params_by_name(mod["main"], params)
    seq = tvm.transform.Sequential(
        [
            transform.CanonicalizeOps(),
            transform.InferType(),
            transform.SimplifyInference(),
            transform.FoldConstant(),
            transform.FoldScaleAxis(),
            # fold consecutive add ops to simplify pattern `conv2d-bias_add-bn-relu`
            transform.SimplifyExpr(),
            transform.FoldConstant(),
            # transform.MergeComposite(pattern_table()),
            transform.AnnotateTarget("ccompiler"),
            transform.MergeCompilerRegions(),
            transform.PartitionGraph(),
        ]
    )
    with tvm.transform.PassContext(opt_level=3):
        mod = seq(mod)
    return mod
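In the printed PMOD output, the offloaded region appears as a separate function carrying a Compiler="ccompiler" attribute, and its mangled name is the tvmgen_default_ccompiler_main_0 symbol from the error below. A quick way to list the functions and their tags (a sketch using the standard IRModule accessors):

# Print every function in the partitioned module together with its
# external compiler tag (None for functions that TVM compiles itself).
for gvar, func in pmod.functions.items():
    tag = func.attrs["Compiler"] if func.attrs is not None and "Compiler" in func.attrs else None
    print(gvar.name_hint, "->", tag)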

And this is the error:

Traceback (most recent call last):
  File "tests/slai/relay_multiply.py", line 33, in <module>
    m = graph_executor.GraphModule(lib["default"](dev))
  File "/Users/.../tvm/python/tvm/_ffi/_ctypes/packed_func.py", line 237, in __call__
    raise get_last_ffi_error()
tvm._ffi.base.TVMError: Traceback (most recent call last):
  [bt] (8) 9   libffi.8.dylib                      0x0000000103ac974c ffi_call_int + 1208
  [bt] (7) 8   libffi.8.dylib                      0x0000000103acc04c ffi_call_SYSV + 76
  [bt] (6) 7   libtvm.dylib                        0x000000011d9d503c TVMFuncCall + 60
  [bt] (5) 6   libtvm.dylib                        0x000000011da91198 tvm::runtime::PackedFuncObj::Extractor<tvm::runtime::PackedFuncSubObj<tvm::runtime::GraphExecutorFactory::GetFunction(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, tvm::runtime::ObjectPtr<tvm::runtime::Object> const&)::$_0> >::Call(tvm::runtime::PackedFuncObj const*, tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) + 328
  [bt] (4) 5   libtvm.dylib                        0x000000011da8bd5c tvm::runtime::GraphExecutorFactory::ExecutorCreate(std::__1::vector<DLDevice, std::__1::allocator<DLDevice> > const&) + 264
  [bt] (3) 4   libtvm.dylib                        0x000000011da77f28 tvm::runtime::GraphExecutor::Init(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, tvm::runtime::Module, std::__1::vector<DLDevice, std::__1::allocator<DLDevice> > const&, tvm::runtime::PackedFunc) + 456
  [bt] (2) 3   libtvm.dylib                        0x000000011da7a418 tvm::runtime::GraphExecutor::SetupOpExecs() + 1780
  [bt] (1) 2   libtvm.dylib                        0x000000011da7f6f8 tvm::runtime::GraphExecutor::CreateTVMOp(tvm::runtime::TVMOpParam const&, std::__1::vector<DLTensor, std::__1::allocator<DLTensor> > const&) + 1228
  [bt] (0) 1   libtvm.dylib                        0x000000011c116d38 tvm::runtime::detail::LogFatal::Entry::Finalize() + 84
  File "/Users/.../tvm/src/runtime/graph_executor/graph_executor.cc", line 529
TVMError:
---------------------------------------------------------------
An error occurred during the execution of TVM.
For more information, please see: https://tvm.apache.org/docs/errors.html
---------------------------------------------------------------
  Check failed: (pf != nullptr) is false: no such function in module: tvmgen_default_ccompiler_main_0

If anyone can help me get BYOC to work, I will be massively thankful!

Hi Nicolas,

did you make any progress on this topic? I’m going through the same process of getting a codegen_c example running. I think the problem you are facing starts earlier in the pipeline.

I just found an entry in the debug log from te_compiler.cc. As far as I understand, the external compiler is looked up in that part of te_compiler.cc via the global registry.

It then seems to be executed around line 211.

I checked inside the codegen.cc file that the C code is generated properly (by print debugging).

However, it looks like the module is not properly created, because the check around line 218 in te_compiler.cc fails.

I found the following line in my debug log, which indicates that the function is not properly set up in the runtime::Module:

[15:25:57] /home/friedrich/tvm-ims/src/relay/backend/te_compiler.cc:221: Build: GraphExecutorCodegen: LowerTE: relay.ext.my_target: Unable to find definition for the external function ‘tvmgen_default_my_target_main_0’ in the runtime module generated by external codegen ‘my_target’

(Just as additional info: I set up a template structure for testing purposes and called the new target “my_target”; however, it contains the same code generator as codegen_c.)

In case you have made further progress, I would be glad if we could share some ideas. I would also be delighted if someone more experienced could share their thoughts.

Best regards,

Martin


Hi Martin,

No, sorry, no progress on my side. This is on standby for me right now, and I am also waiting for some help from TVM veterans.

I will be happy to share any progress once I make some.

Hi Nicolas, I agree it would be great if one of the TVM veterans could explain how to finalize the steps to get the modules running. What I have understood in the previous days is that the codegen is started within the te_compiler. Inside the function LowerExternalFunctions, the globally registered MyTargetCompiler is loaded and code generation is executed per external module. The module is then complemented by adding the new external_mods and the device context, and is returned as an updated_module. To me it looks like the code inside the modules is generated properly.
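One thing that is easy to verify from Python is the first half of that flow, i.e. that the codegen is actually registered under the relay.ext. prefix. A quick check, using my target name:

import tvm

# LowerExternalFunctions looks the external codegen up in the global
# registry under "relay.ext.<target>"; None here means the lookup fails.
print(tvm.get_global_func("relay.ext.my_target", allow_missing=True))

(In my case the code generation clearly runs, so this lookup succeeds.)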

From all the documents, forum entries, etc., I still do not understand:

  • Where and how should the modules be compiled? => The blog entry (How to Bring Your Own Codegen to TVM, headline “C Source Compilation”) explains that the modules need to be compiled and gives a code snippet (my reading of it is sketched after this list). What is the proper place to add this compiling step?
  • So far we have partitioned an operator graph into operations that should be run by the accelerator (BYOC) and parts that should be run by the default target (e.g. a host CPU). How do we tie those parts together at runtime? Is there any example I can use to understand this step, or could anyone please explain?
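My reading of the compile step from that blog section boils down to something like this (a sketch from memory, not the exact snippet; the file name is arbitrary):

import tvm
from tvm.contrib import utils

# The external C source only becomes machine code when the library is
# exported: export_library invokes the local C/C++ compiler on it.
tmp = utils.tempdir()
lib_path = tmp.relpath("lib.so")
lib.export_library(lib_path)

# Reload the now fully compiled library before creating the executor.
loaded_lib = tvm.runtime.load_module(lib_path)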

(@slai-nick: I sent you a DM a couple of days ago, did you read it?)

Hi, I just came across this thread, sorry for the delay. I’m not a veteran, but I have worked through many of the BYOC integrations and was the one who inserted the log entry you refer to.

Perhaps the DNNL library integration would be a good example to look at? The ‘relay.ext.dnnl’ external codegen function, implemented in src/relay/backend/contrib/dnnl/codegen.cc, returns a runtime::Module which correctly binds the symbol. Note the implicit chaining of code generation with code building. Also, tests/python/contrib/test_dnnl.py has an example partition function illustrating the boilerplate passes needed to express the desired partitioning before the rest of the TVM compilation proceeds.
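If it helps while debugging: you can inspect what the external codegen actually returned by looking at the modules imported into the built library. A sketch, assuming lib is the factory returned by relay.build as in the script above:

# For the C-source codegen, one of the imported modules has type_key "c"
# and holds the generated source that should define the
# tvmgen_default_ccompiler_main_0 symbol.
for imported in lib.get_lib().imported_modules:
    print(imported.type_key)
    if imported.type_key == "c":
        print(imported.get_source())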

Note that the ‘UMA’ project intends to make these implicit APIs more structured, which I think is a very welcome improvement: https://github.com/apache/tvm-rfcs/pull/60

Best, -Mark


Hi Martin,

It’s been quite a while, but is there any update? I recently started working on a BYOC project and got stuck on the exact same problem!

Thanks,

Best