I tune it for 20 trials by default. Even so, the best runtime among the 20 trials (as reported in the log) is much better than the final runtime evaluation, so I'm fairly sure the best config from the tuning history is not being loaded for the evaluation; see the sketch after the log. Log:
No: 20 GFLOPS: 129.36/129.36 result: MeasureResult(costs=(0.00178956675,), error_no=0, all_cost=0.9837315082550049, timestamp=1583771181.213391) [('tile_f', [-1, 2, 16, 4]), ('tile_y', [-1, 1, 7, 1]), ('tile_x', [-1, 1, 1, 7]), ('tile_rc', [-1, 64, 1]), ('tile_ry', [-1, 1, 1]), ('tile_rx', [-1, 1, 1]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 0)],None,3509127
Finish loading 20 records
Best config:
[('tile_f', [-1, 2, 16, 4]), ('tile_y', [-1, 1, 7, 1]), ('tile_x', [-1, 1, 1, 7]), ('tile_rc', [-1, 64, 1]), ('tile_ry', [-1, 1, 1]), ('tile_rx', [-1, 1, 1]), ('auto_unroll_max_step', 1500), ('unroll_explicit', 0)],None,3509127
Finish loading 20 records
Cannot find config for target=cuda, workload=None. A fallback configuration is used, which may bring great performance regression.
Time cost of this operator: 0.048827
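The last warning line looks like the telltale sign: `workload=None` typically means the schedule was instantiated outside the `apply_history_best` dispatch context, or from a function that is not registered as an AutoTVM template, so none of the 20 loaded records can be matched and the fallback config is used instead of the best one. For reference, here is a minimal sketch of the evaluation pattern from TVM's `tune_conv2d_cuda` tutorial; `conv2d_no_batching`, its shape arguments, and the `conv2d.log` file name are the tutorial's placeholders, not my actual code, so substitute your own template and log file:

```python
import tvm
from tvm import autotvm

# Sketch of the evaluation step from the tune_conv2d_cuda tutorial.
# conv2d_no_batching is the @autotvm.template-decorated function that was
# tuned; "conv2d.log" is the tuning log. Both are placeholder names.
with autotvm.apply_history_best("conv2d.log"):
    with tvm.target.create("cuda"):  # newer TVM: tvm.target.Target("cuda")
        # The template must be re-invoked inside BOTH contexts so the
        # dispatcher can look up the tuned config by workload; building
        # the schedule outside them triggers the "Cannot find config ...
        # workload=None" warning and the slow fallback config.
        s, arg_bufs = conv2d_no_batching(N, H, W, CO, CI, KH, KW, strides, padding)
        func = tvm.build(s, arg_bufs)
```

If the evaluation builds the operator from a plain, undecorated function instead, the dispatcher has no workload key to match against the log records, which would be consistent with the `workload=None` in the warning above.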