[RFC] AdaTune: Bring adaptive features to the auto-tuner

limenghao · December 14, 2020, 9:47am

Motivation

The AutoTVM module is very useful to bring lower inference latency through the auto-tuning process. However, the auto-tuning usually takes a long time to finish, especially when the task space is really large. With some preliminary studies, we found there are two limitations in the current auto-tuning pipeline.

To collect accurate GFLOPS numbers when evaluating a config on the given hardware target, the measurement usually repeats a fixed number of times, which can be really time-consuming. We would like to reduce the hardware measurement cost while obtaining accurate measurement.
The ModelBasedTuner of the autoTVM module uses a cost model and an optimizer to help choose promising configs to do the next-batch searching. An epsilon-greedy strategy with a fixed epsilon=0.05 is used here to avoid pure exploitation. However, such a value may cause the search to be overly greedy and sometimes trapped in a local optimal. We would like to dynamically balance the trade-off between exploration and exploitation, escaping from local optimum.
```
if self.trial_pt >= len(self.trials) - int(0.05 * self.plan_size):
    # if the trial list is empty or
    # the tuner is doing the last 5% trials (e-greedy), choose randomly
    index = np.random.randint(len(self.space))
    while index in self.visited:
        index = np.random.randint(len(self.space))
```

Solution

Use an adaptive evaluator to obtain an accurate estimate of the GFLOPS number with significantly reduced cost. We early stops the repeating evaluation on one config on the target when the collected data is reliable enough.
Use an uncertainty-aware tuner. Intuitively, when searching the config space, we want it to do more exploitation when the cost model predicts accurately and does more exploration when the cost model doesn’t do a good job. And the uncertainty of the cost model can be taken into consideration when setting the epsilon value (which controls the amount of exploration-vs-exploitation) dynamically.

Implementation

For the adaptive evaluator, we add an argument in the measure_option named enable_adaptive_evaluator. When it is set to true, the evaluation will be conducted in partitioned micro-batches one-by-one. When the coefficient of variation among the micro-batches reach a threshold, it early stops the evaluation and returns the mean cost.
Add a new tuner named RFEITuner in the autotvm.tuner. The RFEITuner inherited from the ModelBasedTuner and uses a new cost model named RFEICostModel. This cost model uses a RandomForestRegressor rather than XGBoost, and replaces the cost estimate part with expected positive improvement.
Add an argument in ModelBasedTuner named uncertainty_aware, and a dynamic_ep is initialized to replace the former fixed epsilon 0.05. This dynamic_ep will be updated dynamically based on the uncertain from the cost model.

Todos

Send the PR.

Our work has almost finished and we will send the PR later, more detailed implementation can be obtained in the paper AdaTune-NIPS2020. Any feedback on the design or implementation is welcomed.

tqchen · December 14, 2020, 1:53pm

cc @merrymercy @junrushao

junrushao · December 14, 2020, 6:04pm

It is definitely interesting! Definitely will read paper

comaniac · December 14, 2020, 7:28pm

The RFC looks interesting and I’m looking forward to seeing more evaluations with AdaTune after upstreaming

In terms of the implementations, there are my two cents for the API design.

The name enable_adaptive_evaluator is vague. Since its behavior is similar to min_repeat_ms which repeats the evaluations until the total runtime exceeds the minimum requirements, could this be combined with that?
uncertainty_aware and dynamic_ep` are also vague. We probably need better naming.

limenghao · December 15, 2020, 3:40am

Thanks very much for your all attention and the naming suggestions.

For the enable_adaptive_evaluator, I agree it’s more reasonable to set the threshold value which early terminates the repeat with the args like min_repeat_ms does. How about replacing it with a min_converge_coef?
I meant to show the dynamic feature in the variables. And the key change is to replace the fixed 5% randomization when do the e-greedy. Setting the uncertainty_aware is to make this dynamic-searching an optional choice as some cost models are not supported. How about a fixed_uncertainty(Fasle for AdaTune) and a randomization_rate? Or you some suggestions? Thanks!

tqchen · February 23, 2021, 4:18pm

@comaniac @junrushao @merrymercy @limenghao would be great to revive the thread and get the adaptive tuning into the mainline