Hi:
I am learning how TVM handles the GPU backend, and I have a question after reading code_gen. It seems that the GPU backend workflow is that TVM first generates CUDA code and then calls NVRTC to compile it directly to PTX. However, in the Tensor Core tutorial (https://docs.tvm.ai/tutorials/optimize/opt_conv_tensorcore.html), TVM uses NVCC to compile the CUDA code instead of using NVRTC to generate PTX.
I am quite confused about the difference between these two workflows…