[AutoTVM] Does GraphTuner support running on GPUs? How does its performance compare to TensorRT?

Hello! I see from this issue that GraphTuner achieves a speedup over both MXNet and TVM tuned with the schedule tuner alone.

Does GraphTuner support running on GPUs? If so, are there any benchmark results comparing GraphTuner and TensorRT?

@kevinthesun

In current TVM, the normal GPU conv2d schedules keep the data layout as the original NCHW. As a result, different schedules do not introduce any graph-level data layout transformations, so graph tuning is not required in this case. However, if in the future we develop a more efficient GPU schedule template that requires a data layout other than NCHW, graph tuning might help.
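To make the reasoning concrete, here is a toy sketch in plain Python (not TVM code). It assumes a simple chain of conv2d layers and counts the points where consecutive layers use different layouts, since those are the places where a layout transformation would have to be inserted. The function name `count_layout_transforms` and the per-layer layout lists are hypothetical and chosen only for illustration; the blocked layouts such as `NCHW8c` stand in for any schedule that picks a layout other than NCHW.

```python
def count_layout_transforms(layer_layouts):
    """Count boundaries between consecutive layers whose layouts differ.

    Toy model: each differing boundary stands for one layout
    transformation that would need to be inserted into the graph.
    """
    return sum(
        1
        for prev, curr in zip(layer_layouts, layer_layouts[1:])
        if prev != curr
    )


# Hypothetical per-layer layout choices for a chain of three conv2d layers.
# Case 1: every schedule keeps NCHW, as described for the GPU schedules.
nchw_only_layouts = ["NCHW", "NCHW", "NCHW"]
# Case 2: schedules pick different blocked layouts (illustrative values).
mixed_layouts = ["NCHW8c", "NCHW16c", "NCHW8c"]

nchw_only_transforms = count_layout_transforms(nchw_only_layouts)
mixed_transforms = count_layout_transforms(mixed_layouts)

print(nchw_only_transforms, mixed_transforms)
```

When every layer stays in NCHW the count is zero, so there is no transformation cost for a graph-level tuner to trade off against per-layer schedule speed. Only when the chosen layouts differ between layers does that trade-off appear.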

Thank you for your reply!