[AutoScheduler] Is there a plan to support auto-scheduling ExternOp?

I spent a lot of time optimizing the sort/argsort kernels for GPUs, and we now get pretty good performance on GPUs from multiple vendors, competitive with those vendors’ hand-tuned libraries.

If these TIR kernels are well optimized, they shouldn’t end up being the bottleneck in models.