No CUDA-capable device is detected

Hi, I am running into a "no CUDA-capable device is detected" error even though I am on an AWS G4dn instance. I am trying to run the code in a container that has CUDA and cuDNN installed, but it still fails with "no CUDA-capable device is detected".
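One cause I want to rule out first is CUDA_VISIBLE_DEVICES masking the GPU inside the container: an empty or -1 value hides every device and produces exactly this error even when the hardware is present. A small sketch of the check (the helper name is mine):

```python
import os

def cuda_devices_masked(env=None):
    """Return True when CUDA_VISIBLE_DEVICES is set to a value that hides
    every GPU (empty string or -1); unset means all GPUs are visible."""
    env = os.environ if env is None else env
    value = env.get("CUDA_VISIBLE_DEVICES")
    if value is None:
        return False  # unset: no masking
    return value.strip() in ("", "-1")

# An empty value masks all devices, so this prints True.
print(cuda_devices_masked({"CUDA_VISIBLE_DEVICES": ""}))
```

In my case the variable does not appear to be set, but I am listing the check here for completeness.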

nvcc reports toolkit version 10.2 (V10.2.89):

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_19:24:38_PDT_2019
Cuda compilation tools, release 10.2, V10.2.89

Error logs -

  File "/usr/lib64/python3.7/site-packages/tvm-0.8.dev1452+g338940dc5-py3.7-linux-x86_64.egg/tvm/contrib/graph_executor.py", line 66, in create
    return GraphModule(fcreate(graph_json_str, libmod, *device_type_id))
  File "tvm/_ffi/_cython/./packed_func.pxi", line 323, in tvm._ffi._cy3.core.PackedFuncBase.__call__
  File "tvm/_ffi/_cython/./packed_func.pxi", line 267, in tvm._ffi._cy3.core.FuncCall
  File "tvm/_ffi/_cython/./base.pxi", line 163, in tvm._ffi._cy3.core.CALL
tvm._ffi.base.TVMError: Traceback (most recent call last):
  8: TVMFuncCall
  7: _ZNSt17_Function_handlerI
  6: tvm::runtime::{lambda(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)#1}::operator()(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*) const [clone .isra.748]
  5: tvm::runtime::GraphExecutorCreate(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, tvm::runtime::Module const&, std::vector<DLDevice, std::allocator<DLDevice> > const&, tvm::runtime::PackedFunc)
  4: tvm::runtime::GraphExecutor::Init(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, tvm::runtime::Module, std::vector<DLDevice, std::allocator<DLDevice> > const&, tvm::runtime::PackedFunc)
  3: tvm::runtime::GraphExecutor::SetupStorage()
  2: tvm::runtime::NDArray::Empty(tvm::runtime::ShapeTuple, DLDataType, DLDevice, tvm::runtime::Optional<tvm::runtime::String>)
  1: tvm::runtime::DeviceAPI::AllocDataSpace(DLDevice, int, long const*, DLDataType, tvm::runtime::Optional<tvm::runtime::String>)
  0: tvm::runtime::CUDADeviceAPI::AllocDataSpace(DLDevice, unsigned long, unsigned long, DLDataType)
  File "/opt/amazon/tvm/src/runtime/cuda/cuda_device_api.cc", line 117
TVMError: 
---------------------------------------------------------------
An error occurred during the execution of TVM.
For more information, please see: https://tvm.apache.org/docs/errors.html
---------------------------------------------------------------
  Check failed: (e == cudaSuccess || e == cudaErrorCudartUnloading) is false: CUDA: no CUDA-capable device is detected

nvidia-smi output:

Mon May 23 22:36:54 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:00:1E.0 Off |                    0 |
| N/A   49C    P0    29W /  70W |   4470MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

ldd /usr/lib64/python3.7/site-packages/tvm-0.8.dev1452+g338940dc5-py3.7-linux-x86_64.egg/tvm/libtvm.so gives:

 linux-vdso.so.1 (0x00007ffc559e4000)
 libnvrtc.so.10.2 => /usr/local/cuda/lib64/libnvrtc.so.10.2 (0x00007fbd1996a000)
 libLLVM-11.so => /usr/lib64/libLLVM-11.so (0x00007fbd143f9000)
 libdl.so.2 => /usr/lib64/libdl.so.2 (0x00007fbd141f5000)
 libcudart.so.10.2 => /usr/local/cuda/lib64/libcudart.so.10.2 (0x00007fbd13f77000)
 libcuda.so.1 => /usr/local/cuda/lib64/libcuda.so.1 (0x00007fbd13d6b000)
 libcudnn.so.8 => /usr/lib64/libcudnn.so.8 (0x00007fbd13b44000)
 libcublas.so.10 => /usr/local/cuda/lib64/libcublas.so.10 (0x00007fbd0f88d000)
 libcublasLt.so.10 => /usr/local/cuda/lib64/libcublasLt.so.10 (0x00007fbd0d9f8000)
 libpthread.so.0 => /usr/lib64/libpthread.so.0 (0x00007fbd0d7da000)
 librt.so.1 => /usr/lib64/librt.so.1 (0x00007fbd0d5d2000)
 libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00007fbd0d250000)
 libm.so.6 => /usr/lib64/libm.so.6 (0x00007fbd0cf10000)
 libgcc_s.so.1 => /usr/lib64/libgcc_s.so.1 (0x00007fbd0ccfa000)
 libc.so.6 => /usr/lib64/libc.so.6 (0x00007fbd0c94f000)
 /lib64/ld-linux-x86-64.so.2 (0x00007fbd1d7a8000)
 libffi.so.6 => /usr/lib64/libffi.so.6 (0x00007fbd0c747000)
 libedit.so.0 => /usr/lib64/libedit.so.0 (0x00007fbd0c50a000)
 libz.so.1 => /usr/lib64/libz.so.1 (0x00007fbd0c2f5000)
 libtinfo.so.6 => /usr/lib64/libtinfo.so.6 (0x00007fbd0c0ca000)

I cannot work out what is actually triggering this "no CUDA-capable device is detected" error, given that nvidia-smi sees the GPU. How can I debug it further, or could this be a bug in TVM?
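To separate a TVM problem from a container/driver problem, I was thinking of probing the CUDA driver API directly with ctypes, bypassing TVM entirely (a sketch; it assumes the driver library is named libcuda.so.1, and the helper name is mine):

```python
import ctypes

def probe_cuda_driver():
    """Call cuInit/cuDeviceGetCount straight from libcuda to see whether
    this process can reach a GPU at all. Returns a device count (int)
    or a diagnostic string."""
    try:
        libcuda = ctypes.CDLL("libcuda.so.1")
    except OSError as exc:
        return f"driver library not loadable: {exc}"
    rc = libcuda.cuInit(0)
    if rc != 0:
        # CUresult 100 is CUDA_ERROR_NO_DEVICE, matching the TVM error above.
        return f"cuInit failed with CUresult {rc}"
    count = ctypes.c_int()
    libcuda.cuDeviceGetCount(ctypes.byref(count))
    return count.value

print(probe_cuda_driver())
```

If this probe already fails inside the container while nvidia-smi works on the host, the problem would be in how the container reaches the driver rather than in TVM itself.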