Hi all,
TVM is already very powerful at generating kernels for different backends, but the TVM stack still has some drawbacks. For example:
- Current kernels in TOPI are only optimized for specific shapes and devices
- Writing a high-performance schedule is not easy

An auto-tuner can help solve these problems. We wrote a technical report on our exploration of bringing an auto-tuner to TVM: https://arxiv.org/abs/1805.08166
Our auto-tuner is backed by machine learning techniques. It is quite interesting that we use machine learning to optimize machine learning itself.
With the auto-tuner, we no longer need to write and tune many `if ... else ...` branches in a schedule to set parameters for different shapes. The auto-tuner can also find good kernels for your specific devices.
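To make the contrast concrete, here is a minimal sketch in plain Python, not the upcoming TVM API. Everything in it is hypothetical: `handwritten_tile` stands for the kind of shape-dependent branching a schedule author writes by hand, `toy_cost` is a made-up formula standing in for a real measurement on the device, and `tune` uses plain exhaustive search rather than the machine-learning-guided search described in the report.

```python
import itertools
import math


def handwritten_tile(n):
    """Hypothetical hand-tuned rule: pick a tile size from the input size."""
    if n >= 1024:
        return 32
    elif n >= 256:
        return 16
    else:
        return 8


def toy_cost(shape, tile_x, tile_y):
    """Made-up stand-in for timing a kernel on a real device (lower is better)."""
    h, w = shape
    tiles = math.ceil(h / tile_x) * math.ceil(w / tile_y)
    padded_work = tiles * tile_x * tile_y   # work including padding waste
    per_tile_overhead = tiles * 64          # fixed cost paid for every tile
    # Pretend that tiles above 1024 elements spill out of fast memory.
    penalty = 4 if tile_x * tile_y > 1024 else 1
    return (padded_work + per_tile_overhead) * penalty


def tune(shape, candidates, cost_fn):
    """Try every (tile_x, tile_y) pair and keep the cheapest one."""
    space = itertools.product(candidates, repeat=2)
    return min(space, key=lambda cfg: cost_fn(shape, *cfg))


if __name__ == "__main__":
    candidates = [8, 16, 32, 64]
    for shape in [(224, 224), (56, 56), (1000, 1000)]:
        manual = handwritten_tile(shape[0])
        tuned = tune(shape, candidates, toy_cost)
        print(shape,
              "manual:", (manual, manual), toy_cost(shape, manual, manual),
              "tuned:", tuned, toy_cost(shape, *tuned))
```

The point of the sketch is only the structure: the branching rule is fixed once by its author, while the search picks parameters per shape (and, with a real measurement in place of `toy_cost`, per device). Exhaustive search stops being practical once the space grows, which is where a learned cost model becomes useful.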
Furthermore, we can even derive schedule code directly from `tvm.compute`. Deriving a schedule from the compute declaration (auto-scheduling) has been tested on CUDA for some operators, and we still need more experiments to improve it.
We are cleaning up the APIs and plan to publish the code in several weeks. Any comments are welcome.