Hi, I am trying to autotune (with AutoTVM) a conv2d task for a CUDA target (NVIDIA A10). However, the process crashes with the error below during tuning when using LocalRunner.
TVM version: 0.17
Error:
terminate called after throwing an instance of 'tvm::runtime::InternalError'
what(): src/runtime/cuda/cuda_module.cc:61: CUDAError: cuModuleUnload(module_[i]) failed with error: CUDA_ERROR_MISALIGNED_ADDRESS
Stack trace:
0: tvm::runtime::CUDAModuleNode::~CUDAModuleNode()
at src/runtime/cuda/cuda_module.cc:61
1: tvm::runtime::SimpleObjAllocator::Handler<tvm::runtime::CUDAModuleNode>::Deleter_(tvm::runtime::Object*)
at include/tvm/runtime/memory.h:138
2: tvm::runtime::Object::DecRef()
at include/tvm/runtime/object.h:850
3: tvm::runtime::Object::DecRef()
at include/tvm/runtime/object.h:846
4: tvm::runtime::ObjectPtr<tvm::runtime::Object>::reset()
at include/tvm/runtime/object.h:455
5: tvm::runtime::ObjectPtr<tvm::runtime::Object>::~ObjectPtr()
at include/tvm/runtime/object.h:404
6: tvm::runtime::ObjectRef::~ObjectRef()
at include/tvm/runtime/object.h:519
Any idea how to debug this?