Confused about kMaxNumGPUs in runtime

I am studying the source code of TVM, and I am confused about the constant kMaxNumGPUs (=32) in /src/runtime/cuda/cuda_module.h.
As I understand it, when we run the compiled model, we can only choose one GPU card. If this is true, why does the TVM runtime set kMaxNumGPUs to 32 and keep the memory allocation status of kMaxNumGPUs GPU cards?

Any help? Besides this, I am also confused about the multi-threaded runtime. The runtime obviously uses C++ threads, but the discussion of how CUDA kernels are launched says that TVM supports no runtime concurrency. @masahi Could you give me some tips?

We use threads for parallelism within an operator. By “concurrency” I meant something like asynchronous execution among operators (also called inter-operator parallelism).
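
To make the distinction concrete, here is a minimal Python sketch. It is not TVM code: `NUM_WORKERS`, `parallel_op`, and `run_graph` are hypothetical names made up for illustration. Each operator splits its input across worker threads (parallelism within an operator), but the operators themselves run strictly one after another (no inter-operator concurrency).

```python
from concurrent.futures import ThreadPoolExecutor

NUM_WORKERS = 4  # hypothetical worker count, not a TVM setting


def parallel_op(pool, fn, data):
    """Run one operator: split `data` into chunks and process them in parallel."""
    chunk = max(1, (len(data) + NUM_WORKERS - 1) // NUM_WORKERS)
    parts = [data[i:i + chunk] for i in range(0, len(data), chunk)]
    # Collecting the results waits for every chunk, so the operator
    # finishes completely before the caller moves on.
    results = pool.map(lambda part: [fn(x) for x in part], parts)
    return [y for part in results for y in part]


def run_graph(data):
    """Execute operators one after another (no inter-operator concurrency)."""
    with ThreadPoolExecutor(max_workers=NUM_WORKERS) as pool:
        out = parallel_op(pool, lambda x: x + 1, data)  # operator 1
        out = parallel_op(pool, lambda x: x * 2, out)   # operator 2 starts only after operator 1 ends
    return out
```

Asynchronous execution among operators would instead mean launching operator 2 (or another independent operator) while operator 1 is still running, which this sketch deliberately does not do.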

Now I understand. Thank you.