Error when autotuning conv2d

Hi, I am trying to autotune (with AutoTVM) a conv2d task for a CUDA target (NVIDIA A10). However, tuning with LocalRunner crashes with the error shown further down.

TVM version: 0.17

Error:

terminate called after throwing an instance of 'tvm::runtime::InternalError'
  what():  src/runtime/cuda/cuda_module.cc:61: CUDAError: cuModuleUnload(module_[i]) failed with error: CUDA_ERROR_MISALIGNED_ADDRESS
  Stack trace:
  0: tvm::runtime::CUDAModuleNode::~CUDAModuleNode()
        at src/runtime/cuda/cuda_module.cc:61
  1: tvm::runtime::SimpleObjAllocator::Handler<tvm::runtime::CUDAModuleNode>::Deleter_(tvm::runtime::Object*)
        at include/tvm/runtime/memory.h:138
  2: tvm::runtime::Object::DecRef()
        at include/tvm/runtime/object.h:850
  3: tvm::runtime::Object::DecRef()
        at include/tvm/runtime/object.h:846
  4: tvm::runtime::ObjectPtr<tvm::runtime::Object>::reset()
        at include/tvm/runtime/object.h:455
  5: tvm::runtime::ObjectPtr<tvm::runtime::Object>::~ObjectPtr()
        at include/tvm/runtime/object.h:404
  6: tvm::runtime::ObjectRef::~ObjectRef()
        at include/tvm/runtime/object.h:519

Any idea how to debug this?