Hello! Wow! The runtimes reduced to as previously!
Extract tasks...
Compile...
Cannot find config for target=llvm -device=arm_cpu -target=aarch64-linux-gnu, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 3, 224, 224), 'float32'), ('TENSOR', (64, 3, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
Cannot find config for target=llvm -device=arm_cpu -target=aarch64-linux-gnu, workload=('conv2d_nchw_winograd.arm_cpu', ('TENSOR', (1, 3, 224, 224), 'float32'), ('TENSOR', (64, 3, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
Cannot find config for target=llvm -device=arm_cpu -target=aarch64-linux-gnu, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 64, 224, 224), 'float32'), ('TENSOR', (64, 64, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
Cannot find config for target=llvm -device=arm_cpu -target=aarch64-linux-gnu, workload=('conv2d_nchw_winograd.arm_cpu', ('TENSOR', (1, 64, 224, 224), 'float32'), ('TENSOR', (64, 64, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
Cannot find config for target=llvm -device=arm_cpu -target=aarch64-linux-gnu, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 64, 112, 112), 'float32'), ('TENSOR', (128, 64, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
Cannot find config for target=llvm -device=arm_cpu -target=aarch64-linux-gnu, workload=('conv2d_nchw_winograd.arm_cpu', ('TENSOR', (1, 64, 112, 112), 'float32'), ('TENSOR', (128, 64, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
Cannot find config for target=llvm -device=arm_cpu -target=aarch64-linux-gnu, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 128, 112, 112), 'float32'), ('TENSOR', (128, 128, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
Cannot find config for target=llvm -device=arm_cpu -target=aarch64-linux-gnu, workload=('conv2d_nchw_winograd.arm_cpu', ('TENSOR', (1, 128, 112, 112), 'float32'), ('TENSOR', (128, 128, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
Cannot find config for target=llvm -device=arm_cpu -target=aarch64-linux-gnu, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 128, 56, 56), 'float32'), ('TENSOR', (256, 128, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
Cannot find config for target=llvm -device=arm_cpu -target=aarch64-linux-gnu, workload=('conv2d_nchw_winograd.arm_cpu', ('TENSOR', (1, 128, 56, 56), 'float32'), ('TENSOR', (256, 128, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
Cannot find config for target=llvm -device=arm_cpu -target=aarch64-linux-gnu, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 256, 56, 56), 'float32'), ('TENSOR', (256, 256, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
Cannot find config for target=llvm -device=arm_cpu -target=aarch64-linux-gnu, workload=('conv2d_nchw_winograd.arm_cpu', ('TENSOR', (1, 256, 56, 56), 'float32'), ('TENSOR', (256, 256, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
Cannot find config for target=llvm -device=arm_cpu -target=aarch64-linux-gnu, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 256, 28, 28), 'float32'), ('TENSOR', (512, 256, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
Cannot find config for target=llvm -device=arm_cpu -target=aarch64-linux-gnu, workload=('conv2d_nchw_winograd.arm_cpu', ('TENSOR', (1, 256, 28, 28), 'float32'), ('TENSOR', (512, 256, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
Cannot find config for target=llvm -device=arm_cpu -target=aarch64-linux-gnu, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 512, 28, 28), 'float32'), ('TENSOR', (512, 512, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
Cannot find config for target=llvm -device=arm_cpu -target=aarch64-linux-gnu, workload=('conv2d_nchw_winograd.arm_cpu', ('TENSOR', (1, 512, 28, 28), 'float32'), ('TENSOR', (512, 512, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
Cannot find config for target=llvm -device=arm_cpu -target=aarch64-linux-gnu, workload=('conv2d_nchw_spatial_pack.arm_cpu', ('TENSOR', (1, 512, 14, 14), 'float32'), ('TENSOR', (512, 512, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
Cannot find config for target=llvm -device=arm_cpu -target=aarch64-linux-gnu, workload=('conv2d_nchw_winograd.arm_cpu', ('TENSOR', (1, 512, 14, 14), 'float32'), ('TENSOR', (512, 512, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
Cannot find config for target=llvm -device=arm_cpu -target=aarch64-linux-gnu, workload=('dense_nopack.x86', ('TENSOR', (1, 4096), 'float32'), ('TENSOR', (1000, 4096), 'float32'), None, 'float32'). A fallback configuration is used, which may bring great performance regression.
Cannot find config for target=llvm -device=arm_cpu -target=aarch64-linux-gnu, workload=('dense_nopack.x86', ('TENSOR', (1, 4096), 'float32'), ('TENSOR', (4096, 4096), 'float32'), None, 'float32'). A fallback configuration is used, which may bring great performance regression.
Cannot find config for target=llvm -device=arm_cpu -target=aarch64-linux-gnu, workload=('dense_nopack.x86', ('TENSOR', (1, 25088), 'float32'), ('TENSOR', (4096, 25088), 'float32'), None, 'float32'). A fallback configuration is used, which may bring great performance regression.
Upload...
Evaluate inference time cost...
Mean inference time (std dev): 1785.88 ms (5.61 ms)
Could you please tell me more about this black magic?
Also: is this a bug? Is it some kind of incompatibility of TVM settings and my ARM board? How to proceed @haichen ?
Thanks for looking into this & Best regards,
Rob