Hi, I am following the C++ deployment instructions to deploy a compiled NNVM graph on my laptop GPU. When I set device_type to kDLOpenCL, I get a segfault after reading the input from a binary file into the DLTensor, as done in that example. Here is the setup:
Compilation
-----------------
Development PC : x86_64 with NVIDIA 920M
TVM runtime is compiled with OpenCL and CUDA enabled.
NNVM graph is built with target = 'opencl', target_host = 'llvm'
---------------------------------------------------------------------------------
Deployment
----------------
int dtype_code = kDLFloat;
int dtype_bits = 32;
int dtype_lanes = 1;
int device_type = kDLOpenCL;
int device_id = 0;
The line that appears to cause the segfault is:
data_fin.read(static_cast<char*>(x->data), 3 * 224 * 224 * 4);
On the same PC, I compiled the graph for the CPU (target='llvm', target_host='llvm') and was able to deploy the exported module from C++ with device_type = kDLCPU. The segfault occurs only when deploying on the GPU. Below is the log:
[15:01:55] src/runtime/opencl/opencl_device_api.cc:231: Multiple OpenCL platforms matched, use the first one ...
[15:01:55] src/runtime/opencl/opencl_device_api.cc:234: Initialize OpenCL platform 'NVIDIA CUDA'
[New Thread 0x7ffff3923700 (LWP 25361)]
[New Thread 0x7ffff3122700 (LWP 25362)]
[New Thread 0x7ffff2921700 (LWP 25363)]
[New Thread 0x7ffff2120700 (LWP 25364)]
[New Thread 0x7ffff191f700 (LWP 25365)]
[New Thread 0x7ffff111e700 (LWP 25366)]
[New Thread 0x7ffff091d700 (LWP 25367)]
[15:01:55] src/runtime/opencl/opencl_device_api.cc:259: opencl(0)='GeForce 920M' cl_device_id=0x6801a0
Thread 1 "ssd_nnvm_demo" received signal SIGSEGV, Segmentation fault.
0x000000000040acf5 in tvm::runtime::Module::GetFunction(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool) ()
Is there a different way to read data from a binary file into an input DLTensor that is allocated on the GPU?
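One thing I am considering is reading the file into ordinary host memory first and then copying it to the device tensor, on the assumption that x->data is not a host-addressable pointer when the tensor lives on an OpenCL device. Below is a minimal sketch of the host-side part. ReadFloatsToHost and the file name input.bin are hypothetical names I made up for illustration, and the copy step assumes the runtime exposes TVMArrayCopyFromBytes in tvm/runtime/c_runtime_api.h. I have not confirmed that this is what causes the crash above.

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of TVM): read `count` float32 values from a
// binary file into a host-side staging buffer.
std::vector<float> ReadFloatsToHost(const std::string& path, std::size_t count) {
  std::ifstream fin(path, std::ios::binary);
  if (!fin) {
    throw std::runtime_error("cannot open " + path);
  }
  std::vector<float> host(count);
  const std::size_t nbytes = count * sizeof(float);
  fin.read(reinterpret_cast<char*>(host.data()),
           static_cast<std::streamsize>(nbytes));
  // Fail loudly if the file holds fewer bytes than requested.
  if (static_cast<std::size_t>(fin.gcount()) != nbytes) {
    throw std::runtime_error("short read from " + path);
  }
  return host;
}

// Intended use with the tensor `x` from the deployment code. Assumption: the
// TVM build provides TVMArrayCopyFromBytes; "input.bin" is a placeholder name.
//
//   std::vector<float> host = ReadFloatsToHost("input.bin", 3 * 224 * 224);
//   TVMArrayCopyFromBytes(x, host.data(), host.size() * sizeof(float));
```

The idea is that the file read only ever touches host memory, and the runtime is left to move the bytes to the device.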
Thanks.