According to the paper “Learning to Optimize Tensor Programs”, it seems that Bayesian Optimization is not a good choice as a tuner, for the reasons below.
- Uncertainty estimation was not as important in the auto-tuning problem, possibly because the models were trained with more samples than in traditional hyper-parameter optimization problems.
- The configuration space S is not invariant across tasks, which prevents Bayesian Optimization from being used for transfer learning.
- Compiling and running a tensor program takes only a few seconds, which is faster than a trial in traditional hyper-parameter optimization problems.
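To make the third point concrete, here is a minimal, generic Bayesian Optimization loop (not the paper's method) with a hand-rolled Gaussian process surrogate on a toy 1-D objective. The objective, kernel, and domain are all illustrative assumptions. Note that refitting the GP posterior costs O(n³) in the number of observed samples, overhead that is easy to justify when one real trial takes hours, but harder when compiling and running a program takes only seconds:

```python
import numpy as np

def rbf_kernel(a, b, length=0.3):
    # squared-exponential kernel between two 1-D sample arrays
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def objective(x):
    # toy stand-in for "compile and run a tensor program, measure cost"
    return np.sin(3 * x) + 0.5 * x

def bayes_opt(n_iters=20, noise=1e-6, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.uniform(0, 2, size=2)           # two initial random configs
    y = objective(X)
    for _ in range(n_iters):
        # refit GP posterior on all samples so far: O(n^3) overhead
        # paid before every single trial
        K = rbf_kernel(X, X) + noise * np.eye(len(X))
        K_inv = np.linalg.inv(K)
        cand = rng.uniform(0, 2, size=256)  # random candidate configs
        Ks = rbf_kernel(cand, X)
        mu = Ks @ K_inv @ y
        var = 1.0 - np.einsum("ij,jk,ik->i", Ks, K_inv, Ks)
        # lower-confidence-bound acquisition (we minimize cost)
        acq = mu - np.sqrt(np.maximum(var, 0.0))
        x_next = cand[np.argmin(acq)]
        X = np.append(X, x_next)
        y = np.append(y, objective(x_next))
    return X[np.argmin(y)], y.min()

best_x, best_y = bayes_opt()
print(best_x, best_y)
```

The expensive step is the surrogate refit and acquisition maximization inside the loop; when real evaluations are cheap and plentiful, a lighter-weight model can be the better trade-off.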
Am I correct? I took screenshots of the relevant paragraphs in the paper.
If Bayesian Optimization does not work well on auto-tuning tasks, why is it mentioned in the last section of the paper?