AutoTVM tasks in tune_relay_arm.py

Hi,

When I run the tutorial "Auto-tuning a Convolutional Network for ARM CPU" (TVM 0.8.dev0 documentation), the tasks extracted by AutoTVM are:

[Task 1/28 conv2d_nchw_spatial_pack.arm_cpu] Task(func_name=conv2d_nchw_spatial_pack.arm_cpu, args=(('TENSOR', (1, 512, 7, 7), 'float32'), ('TENSOR', (512, 512, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 512, 7, 7), 'float32'), ('TENSOR', (512, 512, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'))
[Task 2/28 conv2d_NCHWc.x86] Task(func_name=conv2d_NCHWc.x86, args=(('TENSOR', (1, 512, 7, 7), 'float32'), ('TENSOR', (512, 512, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('conv2d_NCHWc.x86', ('TENSOR', (1, 512, 7, 7), 'float32'), ('TENSOR', (512, 512, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'NCHW', 'NCHW', 'float32'))
[Task 3/28 conv2d_nchw_winograd.arm_cpu] Task(func_name=conv2d_nchw_winograd.arm_cpu, args=(('TENSOR', (1, 512, 7, 7), 'float32'), ('TENSOR', (512, 512, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_winograd.arm_cpu', ('TENSOR', (1, 512, 7, 7), 'float32'), ('TENSOR', (512, 512, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'))
[Task 4/28 conv2d_nchw_spatial_pack.arm_cpu] Task(func_name=conv2d_nchw_spatial_pack.arm_cpu, args=(('TENSOR', (1, 256, 14, 14), 'float32'), ('TENSOR', (512, 256, 3, 3), 'float32'), (2, 2), (1, 1, 1, 1), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 256, 14, 14), 'float32'), ('TENSOR', (512, 256, 3, 3), 'float32'), (2, 2), (1, 1, 1, 1), (1, 1), 'float32'))
[Task 5/28 conv2d_NCHWc.x86] Task(func_name=conv2d_NCHWc.x86, args=(('TENSOR', (1, 256, 14, 14), 'float32'), ('TENSOR', (512, 256, 3, 3), 'float32'), (2, 2), (1, 1, 1, 1), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('conv2d_NCHWc.x86', ('TENSOR', (1, 256, 14, 14), 'float32'), ('TENSOR', (512, 256, 3, 3), 'float32'), (2, 2), (1, 1, 1, 1), (1, 1), 'NCHW', 'NCHW', 'float32'))
[Task 6/28 conv2d_nchw_spatial_pack.arm_cpu] Task(func_name=conv2d_nchw_spatial_pack.arm_cpu, args=(('TENSOR', (1, 256, 14, 14), 'float32'), ('TENSOR', (256, 256, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 256, 14, 14), 'float32'), ('TENSOR', (256, 256, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'))
[Task 7/28 conv2d_NCHWc.x86] Task(func_name=conv2d_NCHWc.x86, args=(('TENSOR', (1, 256, 14, 14), 'float32'), ('TENSOR', (256, 256, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('conv2d_NCHWc.x86', ('TENSOR', (1, 256, 14, 14), 'float32'), ('TENSOR', (256, 256, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'NCHW', 'NCHW', 'float32'))
[Task 8/28 conv2d_nchw_winograd.arm_cpu] Task(func_name=conv2d_nchw_winograd.arm_cpu, args=(('TENSOR', (1, 256, 14, 14), 'float32'), ('TENSOR', (256, 256, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_winograd.arm_cpu', ('TENSOR', (1, 256, 14, 14), 'float32'), ('TENSOR', (256, 256, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'))
[Task 9/28 conv2d_nchw_spatial_pack.arm_cpu] Task(func_name=conv2d_nchw_spatial_pack.arm_cpu, args=(('TENSOR', (1, 128, 28, 28), 'float32'), ('TENSOR', (256, 128, 3, 3), 'float32'), (2, 2), (1, 1, 1, 1), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 128, 28, 28), 'float32'), ('TENSOR', (256, 128, 3, 3), 'float32'), (2, 2), (1, 1, 1, 1), (1, 1), 'float32'))
[Task 10/28 conv2d_NCHWc.x86] Task(func_name=conv2d_NCHWc.x86, args=(('TENSOR', (1, 128, 28, 28), 'float32'), ('TENSOR', (256, 128, 3, 3), 'float32'), (2, 2), (1, 1, 1, 1), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('conv2d_NCHWc.x86', ('TENSOR', (1, 128, 28, 28), 'float32'), ('TENSOR', (256, 128, 3, 3), 'float32'), (2, 2), (1, 1, 1, 1), (1, 1), 'NCHW', 'NCHW', 'float32'))
[Task 11/28 conv2d_nchw_spatial_pack.arm_cpu] Task(func_name=conv2d_nchw_spatial_pack.arm_cpu, args=(('TENSOR', (1, 128, 28, 28), 'float32'), ('TENSOR', (128, 128, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 128, 28, 28), 'float32'), ('TENSOR', (128, 128, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'))
[Task 12/28 conv2d_NCHWc.x86] Task(func_name=conv2d_NCHWc.x86, args=(('TENSOR', (1, 128, 28, 28), 'float32'), ('TENSOR', (128, 128, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('conv2d_NCHWc.x86', ('TENSOR', (1, 128, 28, 28), 'float32'), ('TENSOR', (128, 128, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'NCHW', 'NCHW', 'float32'))
[Task 13/28 conv2d_nchw_winograd.arm_cpu] Task(func_name=conv2d_nchw_winograd.arm_cpu, args=(('TENSOR', (1, 128, 28, 28), 'float32'), ('TENSOR', (128, 128, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_winograd.arm_cpu', ('TENSOR', (1, 128, 28, 28), 'float32'), ('TENSOR', (128, 128, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'))
[Task 14/28 conv2d_nchw_spatial_pack.arm_cpu] Task(func_name=conv2d_nchw_spatial_pack.arm_cpu, args=(('TENSOR', (1, 64, 56, 56), 'float32'), ('TENSOR', (128, 64, 3, 3), 'float32'), (2, 2), (1, 1, 1, 1), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 64, 56, 56), 'float32'), ('TENSOR', (128, 64, 3, 3), 'float32'), (2, 2), (1, 1, 1, 1), (1, 1), 'float32'))
[Task 15/28 conv2d_NCHWc.x86] Task(func_name=conv2d_NCHWc.x86, args=(('TENSOR', (1, 64, 56, 56), 'float32'), ('TENSOR', (128, 64, 3, 3), 'float32'), (2, 2), (1, 1, 1, 1), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('conv2d_NCHWc.x86', ('TENSOR', (1, 64, 56, 56), 'float32'), ('TENSOR', (128, 64, 3, 3), 'float32'), (2, 2), (1, 1, 1, 1), (1, 1), 'NCHW', 'NCHW', 'float32'))
[Task 16/28 conv2d_nchw_spatial_pack.arm_cpu] Task(func_name=conv2d_nchw_spatial_pack.arm_cpu, args=(('TENSOR', (1, 64, 56, 56), 'float32'), ('TENSOR', (64, 64, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 64, 56, 56), 'float32'), ('TENSOR', (64, 64, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'))
[Task 17/28 conv2d_NCHWc.x86] Task(func_name=conv2d_NCHWc.x86, args=(('TENSOR', (1, 64, 56, 56), 'float32'), ('TENSOR', (64, 64, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('conv2d_NCHWc.x86', ('TENSOR', (1, 64, 56, 56), 'float32'), ('TENSOR', (64, 64, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'NCHW', 'NCHW', 'float32'))
[Task 18/28 conv2d_nchw_winograd.arm_cpu] Task(func_name=conv2d_nchw_winograd.arm_cpu, args=(('TENSOR', (1, 64, 56, 56), 'float32'), ('TENSOR', (64, 64, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_winograd.arm_cpu', ('TENSOR', (1, 64, 56, 56), 'float32'), ('TENSOR', (64, 64, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'))
[Task 19/28 conv2d_nchw_spatial_pack.arm_cpu] Task(func_name=conv2d_nchw_spatial_pack.arm_cpu, args=(('TENSOR', (1, 3, 224, 224), 'float32'), ('TENSOR', (64, 3, 7, 7), 'float32'), (2, 2), (3, 3, 3, 3), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 3, 224, 224), 'float32'), ('TENSOR', (64, 3, 7, 7), 'float32'), (2, 2), (3, 3, 3, 3), (1, 1), 'float32'))
[Task 20/28 conv2d_NCHWc.x86] Task(func_name=conv2d_NCHWc.x86, args=(('TENSOR', (1, 3, 224, 224), 'float32'), ('TENSOR', (64, 3, 7, 7), 'float32'), (2, 2), (3, 3, 3, 3), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('conv2d_NCHWc.x86', ('TENSOR', (1, 3, 224, 224), 'float32'), ('TENSOR', (64, 3, 7, 7), 'float32'), (2, 2), (3, 3, 3, 3), (1, 1), 'NCHW', 'NCHW', 'float32'))
[Task 21/28 conv2d_nchw_spatial_pack.arm_cpu] Task(func_name=conv2d_nchw_spatial_pack.arm_cpu, args=(('TENSOR', (1, 64, 56, 56), 'float32'), ('TENSOR', (64, 64, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 64, 56, 56), 'float32'), ('TENSOR', (64, 64, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'float32'))
[Task 22/28 conv2d_NCHWc.x86] Task(func_name=conv2d_NCHWc.x86, args=(('TENSOR', (1, 64, 56, 56), 'float32'), ('TENSOR', (64, 64, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('conv2d_NCHWc.x86', ('TENSOR', (1, 64, 56, 56), 'float32'), ('TENSOR', (64, 64, 1, 1), 'float32'), (1, 1), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32'))
[Task 23/28 conv2d_nchw_spatial_pack.arm_cpu] Task(func_name=conv2d_nchw_spatial_pack.arm_cpu, args=(('TENSOR', (1, 64, 56, 56), 'float32'), ('TENSOR', (128, 64, 1, 1), 'float32'), (2, 2), (0, 0, 0, 0), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 64, 56, 56), 'float32'), ('TENSOR', (128, 64, 1, 1), 'float32'), (2, 2), (0, 0, 0, 0), (1, 1), 'float32'))
[Task 24/28 conv2d_NCHWc.x86] Task(func_name=conv2d_NCHWc.x86, args=(('TENSOR', (1, 64, 56, 56), 'float32'), ('TENSOR', (128, 64, 1, 1), 'float32'), (2, 2), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('conv2d_NCHWc.x86', ('TENSOR', (1, 64, 56, 56), 'float32'), ('TENSOR', (128, 64, 1, 1), 'float32'), (2, 2), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32'))
[Task 25/28 conv2d_nchw_spatial_pack.arm_cpu] Task(func_name=conv2d_nchw_spatial_pack.arm_cpu, args=(('TENSOR', (1, 128, 28, 28), 'float32'), ('TENSOR', (256, 128, 1, 1), 'float32'), (2, 2), (0, 0, 0, 0), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 128, 28, 28), 'float32'), ('TENSOR', (256, 128, 1, 1), 'float32'), (2, 2), (0, 0, 0, 0), (1, 1), 'float32'))
[Task 26/28 conv2d_NCHWc.x86] Task(func_name=conv2d_NCHWc.x86, args=(('TENSOR', (1, 128, 28, 28), 'float32'), ('TENSOR', (256, 128, 1, 1), 'float32'), (2, 2), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('conv2d_NCHWc.x86', ('TENSOR', (1, 128, 28, 28), 'float32'), ('TENSOR', (256, 128, 1, 1), 'float32'), (2, 2), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32'))
[Task 27/28 conv2d_nchw_spatial_pack.arm_cpu] Task(func_name=conv2d_nchw_spatial_pack.arm_cpu, args=(('TENSOR', (1, 256, 14, 14), 'float32'), ('TENSOR', (512, 256, 1, 1), 'float32'), (2, 2), (0, 0, 0, 0), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 256, 14, 14), 'float32'), ('TENSOR', (512, 256, 1, 1), 'float32'), (2, 2), (0, 0, 0, 0), (1, 1), 'float32'))
[Task 28/28 conv2d_NCHWc.x86] Task(func_name=conv2d_NCHWc.x86, args=(('TENSOR', (1, 256, 14, 14), 'float32'), ('TENSOR', (512, 256, 1, 1), 'float32'), (2, 2), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32'), kwargs={}, workload=('conv2d_NCHWc.x86', ('TENSOR', (1, 256, 14, 14), 'float32'), ('TENSOR', (512, 256, 1, 1), 'float32'), (2, 2), (0, 0, 0, 0), (1, 1), 'NCHW', 'NCHW', 'float32'))

But the sample output in the tutorial's comments shows only 12 tasks:

[Task 1/12] Current/Best: 22.37/ 52.19 GFLOPS | Progress: (544/1000) | 406.59 s Done.
[Task 2/12] Current/Best: 6.51/ 18.77 GFLOPS | Progress: (608/1000) | 325.05 s Done.
[Task 3/12] Current/Best: 4.67/ 24.87 GFLOPS | Progress: (480/1000) | 372.31 s Done.
[Task 4/12] Current/Best: 11.35/ 46.83 GFLOPS | Progress: (736/1000) | 602.39 s Done.
[Task 5/12] Current/Best: 1.01/ 19.80 GFLOPS | Progress: (448/1000) | 262.16 s Done.
[Task 6/12] Current/Best: 2.47/ 23.76 GFLOPS | Progress: (672/1000) | 563.85 s Done.
[Task 7/12] Current/Best: 14.57/ 33.97 GFLOPS | Progress: (544/1000) | 465.15 s Done.
[Task 8/12] Current/Best: 1.13/ 17.65 GFLOPS | Progress: (576/1000) | 365.08 s Done.
[Task 9/12] Current/Best: 14.45/ 22.66 GFLOPS | Progress: (928/1000) | 724.25 s Done.
[Task 10/12] Current/Best: 3.22/ 15.36 GFLOPS | Progress: (864/1000) | 564.27 s Done.
[Task 11/12] Current/Best: 11.03/ 32.23 GFLOPS | Progress: (736/1000) | 635.15 s Done.
[Task 12/12] Current/Best: 8.00/ 21.65 GFLOPS | Progress: (1000/1000) | 1111.81 s Done.

My run produces extra tasks, such as conv2d_NCHWc.x86, that the tutorial's output does not show. Why do I get conv2d_NCHWc.x86 tasks when I run AutoTVM for the arm_cpu target? Is this expected behavior?
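
A minimal sketch for checking how the 28 tasks break down by template, assuming the template names are copied by hand from the log above into a plain list. No TVM calls are used; `task_names`, `SP`, `X86`, and `WINO` are illustrative names, not part of the tutorial script.

```python
from collections import Counter

# Template names as they appear in the 28-task log above.
SP = "conv2d_nchw_spatial_pack.arm_cpu"
X86 = "conv2d_NCHWc.x86"
WINO = "conv2d_nchw_winograd.arm_cpu"

# Task order transcribed by hand from the log (assumption: no transcription slips).
task_names = (
    [SP, X86, WINO, SP, X86] * 3  # tasks 1-15
    + [SP, X86, WINO]             # tasks 16-18
    + [SP, X86] * 5               # tasks 19-28
)

# Tally how many tasks each template contributes.
counts = Counter(task_names)
for name, n in counts.most_common():
    print(f"{n:2d}  {name}")

# Keep only the templates registered under the arm_cpu suffix.
arm_only = [name for name in task_names if name.endswith(".arm_cpu")]
print(len(arm_only), "arm_cpu tasks out of", len(task_names))
```

With these names the tally is 12 spatial-pack, 12 x86 NCHWc, and 4 Winograd tasks. The 12 spatial-pack tasks equal the task count in the tutorial's sample output, though that alone does not show they are the same tasks.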