[VTA] + [AutoTVM] runtime/compiler errors in transform.py

Hi,

during the tuning of a single Conv2D layer with AutoTVM on an Ultra96 FPGA, most of the schedules fail with a build or runtime error. Only ~10-20% of the schedules are actually executed. The schedules that do run produce correct results (verified with the check_correctness flag of the RPCRunner), so this seems to be a software/setup issue.

My VTA setup uses the default Ultra96 configuration, the Conv2D configuration and schedule from vta/top, and TVM version 0.7.dev1.

The following errors are the most common ones:

...python/vta/transform.py, line 463, in _get_2d_pattern: RuntimeError: Scope[local.inp_buffer]: cannot detect 2d pattern with elem_block=16: shape=[2, 8, 17, 16], strides=[16384, 512, 16, 1]

...python/vta/transform.py, line 537, in _inject_copy: ValueError: Do not support pad on the innermost block
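For reference, this is my rough reading of the constraint behind the first error (a simplified sketch of my own, not the actual _get_2d_pattern code): the DMA copy has to be expressible as a 2D pattern, i.e. y_size rows of x_size contiguous elem_blocks with a fixed stride between rows, and the shape/strides above apparently cannot be folded that way:

def fold_2d(shape, strides, elem_block=16):
    # innermost dim must be exactly one contiguous elem_block
    if shape[-1] != elem_block or strides[-1] != 1:
        return None
    # fold inner dims outwards as long as they stay contiguous -> x_size blocks per row
    x_size, i = 1, len(shape) - 2
    while i >= 0 and strides[i] == x_size * elem_block:
        x_size *= shape[i]
        i -= 1
    if i < 0:
        return (1, x_size, x_size)          # fully contiguous: a single row
    if strides[i] % elem_block:
        return None
    x_stride, y_size = strides[i] // elem_block, shape[i]
    i -= 1
    # every remaining outer dim must fold into the row dimension
    while i >= 0:
        if strides[i] != y_size * x_stride * elem_block:
            return None                     # <- [2, 8, 17, 16] fails here
        y_size *= shape[i]
        i -= 1
    return (y_size, x_size, x_stride)

print(fold_2d([2, 8, 17, 16], [16384, 512, 16, 1]))  # None -> "cannot detect 2d pattern"
print(fold_2d([8, 17, 16], [512, 16, 1]))            # (8, 17, 32) -> a legal 2D copy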

My Question:

  • Are these errors caused only by schedules that cannot work on VTA, or is there some kind of issue with my setup (see below)?

Thanks for your time :slight_smile:

I adapted the Conv2d optimization tutorial to use AutoTVM:

import logging

from tvm import autotvm
import vta

env = vta.get_env()  # default ultra96 configuration

batch_size = 1
height = 32
width = 32
in_channels = 256
out_channels = 256
kernel_h = 3
kernel_w = 3
pad_h = 1
pad_w = 1
stride_h = 1
stride_w = 1
#
# buffer definitions from the tutorial
# register_vta_tuning_task() from tune_relay_vta.py
#
device = "vta"
target = env.target if device == "vta" else env.target_vta_cpu
task = autotvm.task.create("conv2d_packed.vta",
                           args=(data, kernel, (stride_h, stride_w), (pad_h, pad_w), (1, 1),
                                 "NCHW1n16c", env.acc_dtype),
                           target=target, target_host=env.target_host)

print(task.config_space)
logging.getLogger('autotvm').setLevel(logging.DEBUG)
logging.getLogger('autotvm').addHandler(logging.FileHandler("./single-conv-debug.log"))
tracker_host = "192.168.2.1"
tracker_port = 9190
measure_option = autotvm.measure_option(
    builder=autotvm.LocalBuilder(),
    runner=autotvm.RPCRunner(env.TARGET, host=tracker_host, port=tracker_port,
                             n_parallel=1, number=5, repeat=2, check_correctness=True))
tuner = autotvm.tuner.GATuner(task)
tuner.tune(n_trial=100,
           measure_option=measure_option,
           callbacks=[autotvm.callback.log_to_file('single_conv.log')])
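
To quantify how many of the 100 trials actually ran, I count the records in the tuning log afterwards (a small sketch, assuming the single_conv.log written by the callback above):

from tvm import autotvm

# error_no == 0 means the trial built and ran; anything else is a
# build / runtime / correctness error and the record carries no timing
records = list(autotvm.record.load_from_file("single_conv.log"))
valid = sum(1 for inp, res in records if res.error_no == 0)
print("valid schedules: %d / %d" % (valid, len(records)))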

Is this normal behavior? Can anyone share experiences with AutoTVM and VTA? @tqchen @thierry?

Hi @dsr91, thanks for the question. Indeed, the rationale behind using the autotuner was that it would eliminate invalid schedules during the schedule search. On VTA, many schedules lead to invalid code generation, either because the generated code does not match the patterns VTA expects (e.g. for conv2d), or for some other reason (runtime checks for valid schedules, etc.). Therefore many of the schedules get discarded.

In general I think there could be a much smarter way to approach the schedule search, e.g. by applying more constraints up front, but for that experiment on FPGAs it did the trick.
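
The discarded trials simply end up in the log with a non-zero error code and are never selected afterwards; once tuning is done you can extract and apply the best valid schedule as usual (a minimal sketch, assuming the single_conv.log from your script):

from tvm import autotvm

# keep only the best valid record per workload ...
autotvm.record.pick_best("single_conv.log", "single_conv_best.log")

# ... and use it whenever the matching task is compiled
with autotvm.apply_history_best("single_conv_best.log"):
    pass  # e.g. vta.build_config() + relay.build(...) as in tune_relay_vta.py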

Ok, good to know this is the intended behaviour. Thanks for the response :slight_smile:

But the VTA conv2d schedule is defined by vta.top.schedule_conv2d_packed, and this schedule template passes the test case in vta/tests/python/integration/test_benchmark_topi_conv2d.py.

So, using the same schedule template, why does tuning conv2d through the Relay build (tune_relay_vta.py) produce these errors, while the topi conv2d VTA tuning is OK?