Hi everyone, I'm hoping for some guidance on what I might be doing wrong. I'm trying to tune multiple operators/shapes sequentially in a single script using auto_scheduler.
My code looks roughly as follows:
for shape in shapes_list:
    tune_option = auto_scheduler.TuningOptions(
        num_measure_trials=num_trials,
        num_measures_per_round=num_measures_per_round,
        verbose=2,
        measure_callbacks=[
            auto_scheduler.RecordToFile(
                "outputs/" + operator.__name__ + str(compute_args) + ".json"
            )
        ],
    )
    cost_model = XGBModel()
    s_policy = SketchPolicy(task, cost_model, params=sketch_policy_params)
    sch, args = auto_scheduler.auto_schedule(
        task, search_policy=s_policy, tuning_options=tune_option
    )
When I run multiple shapes in this loop, the first shape finds schedules without a problem, but every subsequent shape hits numerous measure timeout errors. Here is a log from a run that tunes two gemm shapes back to back:
----------------------------------------------------------------------
------------------------------ [ Search ]
----------------------------------------------------------------------
Generate Sketches #s: 3
Number of sketches generated: 3
Init Population, sketch select strategy: random, total sketches: 3
bounds: Sample Initial Population #s: 1949 fail_ct: 27 Time elapsed: 0.39
GA Iter: 0 Max score: 1.0000 Min score: 0.9387 #Pop: 128 #M+: 0 #M-: 0
GA Iter: 5 Max score: 1.0000 Min score: 0.9903 #Pop: 128 #M+: 1438 #M-: 83
GA Iter: 10 Max score: 1.0000 Min score: 0.9944 #Pop: 128 #M+: 1567 #M-: 90
EvolutionarySearch #s: 128 Time elapsed: 3.26
----------------------------------------------------------------------
------------------------------ [ Measure ]
----------------------------------------------------------------------
Get 10 programs to measure.
Gathering gemm shapes
op: batch_matmul_nkkm compute arg: (1, 128, 128, 128)
..........**********==================================================
.
.
.
----------------------------------------------------------------------
------------------------------ [ Search ]
----------------------------------------------------------------------
Generate Sketches #s: 3
Number of sketches generated: 3
Init Population, sketch select strategy: random, total sketches: 3
bounds: Sample Initial Population #s: 1966 fail_ct: 22 Time elapsed: 0.46
GA Iter: 0 Max score: 0.9984 Min score: 0.9365 #Pop: 128 #M+: 0 #M-: 0
GA Iter: 5 Max score: 1.0000 Min score: 0.9889 #Pop: 128 #M+: 1436 #M-: 79
GA Iter: 10 Max score: 1.0000 Min score: 0.9938 #Pop: 128 #M+: 1571 #M-: 86
EvolutionarySearch #s: 128 Time elapsed: 3.18
----------------------------------------------------------------------
------------------------------ [ Measure ]
----------------------------------------------------------------------
Get 10 programs to measure.
Execution time of this operator: 0.021 ms
op: batch_matmul_nkkm compute arg: (1, 512, 32, 512)
.........*T*T*T*T*T*T*T*T*T==================================================
You can see that all the schedules measured for this (1, 512, 32, 512) shape run into timeout issues.
But if I run just this shape (or run it as the first shape searched), then all of the schedules succeed, as follows (note this is a separate run):
----------------------------------------------------------------------
------------------------------ [ Search ]
----------------------------------------------------------------------
Generate Sketches #s: 3
Number of sketches generated: 3
Init Population, sketch select strategy: random, total sketches: 3
bounds: Sample Initial Population #s: 1951 fail_ct: 21 Time elapsed: 0.39
GA Iter: 0 Max score: 0.9997 Min score: 0.9417 #Pop: 128 #M+: 0 #M-: 0
GA Iter: 5 Max score: 1.0000 Min score: 0.9898 #Pop: 128 #M+: 1438 #M-: 74
GA Iter: 10 Max score: 1.0000 Min score: 0.9948 #Pop: 128 #M+: 1565 #M-: 81
EvolutionarySearch #s: 128 Time elapsed: 3.31
----------------------------------------------------------------------
------------------------------ [ Measure ]
----------------------------------------------------------------------
Get 10 programs to measure.
Gathering gemm shapes
op: batch_matmul_nkkm compute arg: (1, 512, 32, 512)
..........**********==================================================
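To make the progress lines in these logs easier to compare, here is a small sketch that tallies one of them. It assumes my reading of the verbose output is right: "." is a successful build, "*" is a successful measurement, and a marker followed by a capital letter is a failure ("*T" being a run timeout). The function name summarize_progress is my own helper, not part of auto_scheduler.

```python
import re


def summarize_progress(line):
    """Tally build and run outcomes in one auto_scheduler progress line.

    Assumed convention (my reading of the verbose output, not confirmed):
    '.' = build ok, '*' = run ok, and a marker followed by an uppercase
    letter is a failure, with 'T' meaning timeout.
    """
    # Drop surrounding whitespace and the trailing '=' separator.
    body = line.strip().rstrip("=")
    # Each token is '.' or '*', optionally followed by one status letter.
    tokens = re.findall(r"[.*][A-Z]?", body)

    summary = {
        "build_ok": 0,
        "build_fail": 0,
        "run_ok": 0,
        "run_timeout": 0,
        "run_other_fail": 0,
    }
    for tok in tokens:
        if tok == ".":
            summary["build_ok"] += 1
        elif tok.startswith("."):
            summary["build_fail"] += 1
        elif tok == "*":
            summary["run_ok"] += 1
        elif tok == "*T":
            summary["run_timeout"] += 1
        else:
            summary["run_other_fail"] += 1
    return summary
```

Feeding it the progress line from the failing shape and from the standalone run gives a quick side-by-side of how many measurements timed out in each case.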
Note that I'm using the default builder and runner.
Is there some state cleanup I need to do between successive calls to auto_scheduler.auto_schedule that I'm missing? Or can anyone point me in a direction to fix this behaviour?
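Since the second shape works when it is tuned alone, one experiment that might show whether state is leaking between calls is to tune each shape in its own Python process. The sketch below is only a driver for that experiment, not a fix. The worker command and the convention of passing the shape as a JSON string in the last argument are hypothetical; in practice the worker would be a script containing the body of the loop above.

```python
import json
import subprocess
import sys


def tune_in_fresh_process(worker_cmd, shape, timeout_s=None):
    """Run one tuning job in a new interpreter and return its exit code.

    worker_cmd is a hypothetical command (e.g. [sys.executable, "my_tune.py"])
    that reads the shape from its last argument as a JSON list.
    """
    completed = subprocess.run(
        list(worker_cmd) + [json.dumps(shape)],
        timeout=timeout_s,
    )
    return completed.returncode


# Stand-in worker so the sketch runs without TVM: it only echoes the shape.
demo_worker = [
    sys.executable,
    "-c",
    "import json, sys; print('would tune', json.loads(sys.argv[1]))",
]

for shape in [(1, 128, 128, 128), (1, 512, 32, 512)]:
    exit_code = tune_in_fresh_process(demo_worker, shape)
    print(shape, "exit code:", exit_code)
```

If the timeouts disappear when every shape gets a fresh process, that would suggest something left over from the first auto_schedule call is the cause; if they persist, the problem is probably elsewhere.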