Cross-compile app on x86 for NVIDIA AGX Xavier with CUDA

Hello everyone,

I am trying to use my compiled NN on an AGX board with the C++ API. I have already deployed the NN on my x86 machine, and it works fine there.

I am now trying to deploy it on the AGX. I compiled the NN on my x86 machine with the command tvmc.compile(model, target="cuda -arch=sm_72", target_host='llvm -mtriple=aarch64-linux-gnu', cross='aarch64-linux-gnu-gcc', package_path="panda_controller.tar") and then compiled the app with gcc/g++. Everything works fine if I compile the app on the AGX machine itself.
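
For readers who want to reproduce the compile step, it can be sketched as a small script. This is only a minimal sketch: the helper names `build_compile_kwargs` and `compile_for_agx` and the `model_path` argument are hypothetical, and it assumes a TVM build on the x86 machine with CUDA and LLVM enabled. The option values are the ones quoted in the command above.

```python
"""Sketch of the cross-compilation step for the AGX Xavier (sm_72, aarch64)."""


def build_compile_kwargs(package_path="panda_controller.tar"):
    """Return the tvmc.compile keyword arguments quoted in the post."""
    return {
        "target": "cuda -arch=sm_72",                      # GPU code for Xavier
        "target_host": "llvm -mtriple=aarch64-linux-gnu",  # host code for aarch64
        "cross": "aarch64-linux-gnu-gcc",                  # cross compiler used for linking
        "package_path": package_path,                      # output .tar package
    }


def compile_for_agx(model_path):
    """Load a model file and cross-compile it (hypothetical helper, needs TVM)."""
    # Imported lazily so build_compile_kwargs() can be inspected without TVM.
    from tvm.driver import tvmc

    model = tvmc.load(model_path)  # model_path is a placeholder for the model file
    return tvmc.compile(model, **build_compile_kwargs())
```

Calling `compile_for_agx(...)` with the path to the model file would then produce the same `panda_controller.tar` package that is used on both machines.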

But when I compile the app on x86 and run it on the AGX, I get the following error:

terminate called after throwing an instance of 'tvm::runtime::InternalError'
  what():  [00:05:07] /home/tc/tvm/src/runtime/library_module.cc:118: Binary was created using {cuda} but a loader of that name is not registered. Available loaders are GraphRuntimeFactory, metadata_module, metadata, GraphExecutorFactory, const_loader, AotExecutorFactory, VMExecutable. Perhaps you need to recompile with this runtime enabled.
Stack trace:
  0: tvm::runtime::LoadModuleFromBinary(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, dmlc::Stream*)
  1: tvm::runtime::ProcessModuleBlob(char const*, tvm::runtime::ObjectPtr<tvm::runtime::Library>, std::function<tvm::runtime::PackedFunc (int (*)(TVMValue*, int*, int, TVMValue*, int*, void*), tvm::runtime::ObjectPtr<tvm::runtime::Object> const&)>, tvm::runtime::Module*, tvm::runtime::ModuleNode**)
  2: tvm::runtime::CreateModuleFromLibrary(tvm::runtime::ObjectPtr<tvm::runtime::Library>, std::function<tvm::runtime::PackedFunc (int (*)(TVMValue*, int*, int, TVMValue*, int*, void*), tvm::runtime::ObjectPtr<tvm::runtime::Object> const&)>)
  3: tvm::runtime::PackedFuncObj::Extractor<tvm::runtime::PackedFuncSubObj<tvm::runtime::{lambda(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)#1}> >::Call(tvm::runtime::PackedFuncObj const*, tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)
  4: tvm::runtime::Module::LoadFromFile(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
  5: main
  6: __libc_start_main
  7: 0x000000557a4136c3
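
The error message suggests that the runtime looks up a loader by name for each serialized module type. As far as I can tell from library_module.cc, the registry key is the prefix "runtime.module.loadbinary_" followed by the type key (here "cuda"). Below is a minimal sketch of how one could check whether a given TVM runtime registers that loader. The helper names and the injectable `get_global_func` parameter are hypothetical, and the check only reflects the runtime library that the Python package loads, which may differ from the runtime the C++ app is linked against.

```python
LOADER_PREFIX = "runtime.module.loadbinary_"  # prefix assumed from library_module.cc


def loader_key(type_key):
    """Registry name looked up for a serialized module of this type."""
    return LOADER_PREFIX + type_key


def has_loader(type_key, get_global_func=None):
    """Return True if the runtime visible to this interpreter registers the loader."""
    if get_global_func is None:
        # Lazy import so loader_key() can be used without TVM installed.
        import tvm

        get_global_func = tvm.get_global_func
    # allow_missing=True makes the lookup return None instead of raising.
    return get_global_func(loader_key(type_key), allow_missing=True) is not None
```

If `has_loader("cuda")` returned False on the AGX, that would be consistent with a runtime library built without CUDA support, which is what the "Perhaps you need to recompile with this runtime enabled" hint points at.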

My CUDA version is 10.2 on both machines, and I use the same .tar file in both cases.

Can anyone help me with it?