Current main branch fails TensorRT unit test

I found that the most recent TVM version fails the TensorRT unit tests. I don’t understand why I get these failures, given that the current version of the main branch passed CI. I suspect the issue could be the version of dmlc-core or TensorRT. Does anybody have an idea why this happens? I ran the tests with the following command:

TVM_FFI=ctypes python -m pytest -v tests/python/contrib/test_tensorrt.py

enabled targets: llvm; nvptx; llvm -device=arm_cpu; cuda
pytest marker:
============================= test session starts ==============================
platform linux -- Python 3.6.10, pytest-6.2.1, py-1.10.0, pluggy-0.13.1 -- /opt/anaconda3/envs/tvm_fleet/bin/python
cachedir: .pytest_cache
rootdir: /home/byungsoj/temp_local/tvm, configfile: pytest.ini
collecting ... collected 49 items

tests/python/contrib/test_tensorrt.py::test_tensorrt_simple PASSED [ 2%]
tests/python/contrib/test_tensorrt.py::test_tensorrt_simple_cpu_io PASSED [ 4%]
tests/python/contrib/test_tensorrt.py::test_tensorrt_not_compatible PASSED [ 6%]
tests/python/contrib/test_tensorrt.py::test_tensorrt_serialize_graph_executor FAILED [ 8%]
tests/python/contrib/test_tensorrt.py::test_tensorrt_serialize_vm FAILED [ 10%]

This also seems related to an error I hit another time, when I ran my own code that uses TensorRT in TVM:

File "/home/byungsoj/backend-aware-graph-opt/package/backend_operator/target.py", line 556, in measure_cost
  lib = relay.build(net, target=target, params=params)
File "/home/byungsoj/tvm/python/tvm/relay/build_module.py", line 283, in build
  graph_json, runtime_mod, params = bld_mod.build(ir_mod, target, target_host, params)
File "/home/byungsoj/tvm/python/tvm/relay/build_module.py", line 132, in build
  self._build(mod, target, target_host)
File "/home/byungsoj/tvm/python/tvm/_ffi/_ctypes/packed_func.py", line 237, in __call__
  raise get_last_ffi_error()
tvm._ffi.base.TVMError: Traceback (most recent call last):
  11: TVMFuncCall
  10: _ZNSt17_Function_handlerIFvN3tvm
  9: tvm::relay::backend::RelayBuildModule::GetFunction(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, tvm::runtime::ObjectPtr<tvm::runtime::Object> const&)::{lambda(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)#3}::operator()(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const
  8: tvm::relay::backend::RelayBuildModule::BuildRelay(tvm::IRModule, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, tvm::runtime::NDArray, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, tvm::runtime::NDArray> > > const&)
  7: std::_Function_handler<void (tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*), tvm::relay::backend::GraphRuntimeCodegenModule::GetFunction(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, tvm::runtime::ObjectPtr<tvm::runtime::Object> const&)::{lambda(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)#2}>::_M_invoke(std::_Any_data const&, tvm::runtime::TVMArgs&&, tvm::runtime::TVMRetValue*&&)
  6: tvm::relay::backend::GraphRuntimeCodegen::Codegen(tvm::relay::Function)
  5: tvm::relay::CompileEngineImpl::LowerExternalFunctions()
  4: tvm::runtime::TypedPackedFunc<tvm::runtime::Module (tvm::runtime::ObjectRef const&)>::AssignTypedLambda<tvm::runtime::Module (*)(tvm::runtime::ObjectRef const&)>(tvm::runtime::Module (*)(tvm::runtime::ObjectRef const&), std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)::{lambda(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*)#1}::operator()(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*) const
  3: tvm::relay::contrib::TensorRTCompiler(tvm::runtime::ObjectRef const&)
  2: tvm::relay::backend::contrib::JSONSerializer::Save(dmlc::JSONWriter*)
  1: tvm::runtime::json::JSONGraphNode::Save(dmlc::JSONWriter*)
  0: dmlc::json::Handler<dmlc::any>::Write(dmlc::JSONWriter*, dmlc::any const&)
File "/home/byungsoj/tvm/3rdparty/dmlc-core/include/dmlc/json.h", line 590
TVMError: Check failed: it != nmap.end() && it->first == id == false: Type St6vectorINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaIS5_EE has not been registered via DMLC_JSON_ENABLE_ANY
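For context, the failing check lives in dmlc-core’s JSON handler for dmlc::any: every concrete type stored in a dmlc::any has to be registered via the DMLC_JSON_ENABLE_ANY macro before a JSONWriter can serialize it, and the check at json.h line 590 fires when the lookup by type fails. Here is a minimal standalone sketch of that mechanism (just an illustration compiled against dmlc-core, not TVM’s actual registration site):

#include <dmlc/json.h>

#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// Each concrete type stored in a dmlc::any must be registered before
// a JSONWriter can serialize it; without this line, writing the any
// below fails with the same "has not been registered via
// DMLC_JSON_ENABLE_ANY" check as in my traceback.
DMLC_JSON_ENABLE_ANY(std::vector<std::string>, list_str);

int main() {
  dmlc::any value = std::vector<std::string>{"tensorrt", "cuda"};
  std::ostringstream os;
  dmlc::JSONWriter writer(&os);
  writer.Write(value);  // looks up the registered handler by the value's type
  std::cout << os.str() << std::endl;
  return 0;
}

The mangled type in the error message demangles to std::vector<std::string>, so it looks like the TensorRT JSON serializer stores an attribute whose type registration is missing or out of sync, which is why I suspect the dmlc-core submodule version.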

@trevor-m do you have any clue?

I just tried:

TVM_FFI=ctypes python3 -m pytest -v tests/python/contrib/test_tensorrt.py

And all tests passed. That said, I did git submodule update --init --recursive yesterday, so maybe not all submodules have been updated in your setup?

Hope that helps.

Thank you for your reply! I made sure that I have updated the submodules. Could you tell me which Docker image you tried with? Was it Dockerfile.ci_gpu? And could you share your submodule versions from the “git submodule” command?

I am not running Docker. I just cloned the project, updated the submodules, changed to the project directory, and issued the command. So basically:

git clone --recursive https://github.com/apache/tvm tvm
cd tvm
TVM_FFI=ctypes python3 -m pytest -v tests/python/contrib/test_tensorrt.py

Now, if you already have your own copy of the repo, then pull the latest submodules: git submodule update --init --recursive
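One thing worth checking: if the dmlc-core submodule did move, the C++ side has to be rebuilt before the updated headers take effect. Assuming a standard CMake build directory (adjust to your setup):

cd build
cmake ..
make -j$(nproc)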

I do use Docker for TVM-related projects, but not for testing TVM itself. If you are concerned about messing up your system by installing pip packages globally, you can use a virtual environment. That is what I do when I run outside of Docker and work on TVM.
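For example, something along these lines (the environment path and package list are just placeholders; install whatever your setup needs):

python3 -m venv ~/.venvs/tvm
source ~/.venvs/tvm/bin/activate
pip install numpy decorator attrs pytest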

Re-reading your problem, maybe the issue is with the Docker container you are using?