Ansor uses its XGBoost-based cost model in an unusual way: each prediction is the sum of the outputs of several XGBoost calls, and the model is trained with a “pack-sum” loss.
Training a cost model this way seems interesting. Can anyone explain the mechanism in detail? The only reference I have found is the source code (tvm/xgb_model.py at main · apache/tvm · GitHub).
I have read the paper and the comments. What I don’t understand is how a model can be trained when the ground truth is available only for the sum, not for the individual terms being summed. There is also a custom callback function registered with XGBoost, and I have not figured out how it works.
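
To make the question concrete, here is a minimal sketch of how I imagine a pack-sum squared-error objective could work. It is my own guess, not code taken from xgb_model.py, and every name in it (`pack_sum_grad_hess`, `pack_ids`, `pack_labels`) is hypothetical. The assumption is that each row belongs to one "pack", that only the pack as a whole has a label, and that the loss is half the squared difference between the summed row predictions and that label. As far as I know, an XGBoost custom objective only has to return a per-row gradient and Hessian, so the residual of the pack would be handed back to every row in it.

```python
def pack_sum_grad_hess(preds, pack_ids, pack_labels):
    """Gradient and Hessian of 0.5 * (sum of preds in a pack - pack label)^2.

    preds:       per-row predictions (one per XGBoost call)
    pack_ids:    for each row, the index of the pack it belongs to
    pack_labels: one ground-truth value per pack (hypothetical layout)
    """
    # Sum the row predictions inside each pack.
    pack_sums = [0.0] * len(pack_labels)
    for pred, pack in zip(preds, pack_ids):
        pack_sums[pack] += pred

    # d(loss)/d(pred_i) is the residual of the pack that row i belongs to,
    # so every row in a pack receives the same gradient.
    grad = [pack_sums[pack] - pack_labels[pack] for pack in pack_ids]

    # The second derivative with respect to each row's own prediction is 1.
    hess = [1.0] * len(preds)
    return grad, hess


# Three rows: rows 0 and 1 form pack 0, row 2 forms pack 1.
grad, hess = pack_sum_grad_hess([1.0, 2.0, 3.0], [0, 0, 1], [2.0, 5.0])
```

If this is roughly right, the individual rows never need labels of their own. Is that what the real implementation does?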