Thanks @luchangli for asking!
Tuning speed and kernel performance can both be improved in several directions, and I believe our system paves the way for each of them:
- system improvements that allow faster search
- a better cost model or search algorithm
- more schedule primitives, like software pipelining and tensorization