Autotune network with my own kernels

Do you want to write your own schedule and autotune it? If so, you can register a new template key for both the compute and the schedule, as is done here: https://github.com/dmlc/tvm/blob/master/topi/python/topi/cuda/conv2d.py#L31 and https://github.com/dmlc/tvm/blob/master/topi/python/topi/cuda/conv2d.py#L119-L120.

Then, when tuning, create each task with this template key:

tsk = autotvm.task.create(tasks[i].name, tasks[i].args, tasks[i].target, tasks[i].target_host, new_template_key)

This is similar to how the winograd template is selected during tuning.
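To make the idea of a template key concrete, here is a toy sketch in plain Python. It does not use TVM at all, and every name in it (`register_template`, `create_task`, `retarget`, `"my_kernel"`) is hypothetical and only meant to illustrate the pattern: several templates are registered for the same operator under different keys, and a task is re-created with a different key to pick another template.

```python
# Toy illustration only: this is NOT TVM's API. All names are hypothetical.

# Registry mapping (operator name, template key) -> template function.
_TEMPLATES = {}


def register_template(op_name, template_key):
    """Decorator that registers a template for an operator under a key."""
    def decorator(func):
        _TEMPLATES[(op_name, template_key)] = func
        return func
    return decorator


@register_template("conv2d", "direct")
def conv2d_direct(args):
    # Stand-in for the default compute/schedule pair.
    return "direct schedule for {}".format(args)


@register_template("conv2d", "my_kernel")
def conv2d_my_kernel(args):
    # Stand-in for a user-written compute/schedule pair.
    return "custom schedule for {}".format(args)


def create_task(op_name, args, template_key="direct"):
    """Build a task record bound to the template registered under the key."""
    if (op_name, template_key) not in _TEMPLATES:
        raise ValueError(
            "no template {!r} registered for {!r}".format(template_key, op_name)
        )
    return {
        "name": op_name,
        "args": args,
        "template_key": template_key,
        "template": _TEMPLATES[(op_name, template_key)],
    }


def retarget(tasks, new_template_key):
    """Re-create every task with a different template key."""
    return [create_task(t["name"], t["args"], new_template_key) for t in tasks]


if __name__ == "__main__":
    tasks = [create_task("conv2d", (1, 3, 224, 224))]
    tasks = retarget(tasks, "my_kernel")
    print(tasks[0]["template"](tasks[0]["args"]))
```

In the real flow, the registration step corresponds to the decorators at the two linked lines in conv2d.py, and the re-creation step corresponds to the `autotvm.task.create(...)` call with the new template key.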