That's possible if you set the same tuning time for both platforms, because the tuning space for GPU is usually much larger than for CPU, so the same budget explores a much smaller fraction of it. If you want, you could dive into the schedule and tuning space implementation in TOPI and refine the tuning space to make the search more efficient. For example, here is the conv2d implementation in TOPI.
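To make the budget argument concrete, here is a minimal sketch in plain Python. The space sizes are made-up numbers for illustration only, and `trials_for_task` and `coverage` are hypothetical helpers, not TVM functions. In AutoTVM the real space size should be available as `len(task.config_space)`, and the budget is what you pass as `n_trial` to the tuner.

```python
def trials_for_task(space_size: int, n_trial: int) -> int:
    """Cap the trial budget at the number of configs in the space."""
    return min(n_trial, space_size)


def coverage(space_size: int, n_trial: int) -> float:
    """Fraction of the tuning space visited with the given budget."""
    return trials_for_task(space_size, n_trial) / space_size


# Hypothetical sizes, chosen only to show the effect of the same budget.
cpu_space = 1_000
gpu_space = 100_000_000
budget = 1_000

print(f"CPU coverage: {coverage(cpu_space, budget):.4%}")
print(f"GPU coverage: {coverage(gpu_space, budget):.4%}")
```

With the same budget, the smaller space is searched exhaustively while the larger one is barely sampled, which is why either a longer GPU tuning run or a pruned GPU tuning space tends to help.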
In addition, another straightforward way to get high performance on GPU is to enable cuDNN. To do so, first make sure USE_CUDNN is set to ON when you build TVM, and then specify the target as "cuda -libs=cudnn". The details can be found here.
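As a rough sketch of the cuDNN route, assuming a Relay model and a TVM build with USE_CUDNN set to ON: `cuda_target_string` and `build_with_cudnn` are hypothetical helper names, and `mod` and `params` stand for whatever your frontend import produced. TVM is imported lazily so the string helper can be used on its own.

```python
def cuda_target_string(use_cudnn: bool = True) -> str:
    """Return the target string, optionally offloading to cuDNN."""
    return "cuda -libs=cudnn" if use_cudnn else "cuda"


def build_with_cudnn(mod, params):
    """Compile a Relay module for CUDA with cuDNN enabled.

    Assumes TVM was built with USE_CUDNN set to ON; otherwise the
    cuDNN-backed operators will not be available.
    """
    import tvm
    from tvm import relay

    target = tvm.target.Target(cuda_target_string(use_cudnn=True))
    with tvm.transform.PassContext(opt_level=3):
        return relay.build(mod, target=target, params=params)


print(cuda_target_string())
```

Operators that cuDNN supports, such as conv2d, are then dispatched to the library instead of the tuned TOPI schedules, so it is worth comparing both variants on your model.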