Hi All, I’m a TVM beginner, so thanks in advance for your patience.
I’ve followed the relevant tutorials on Android deployment, as well as on tuning ConvNets for ARM. Now I’m tuning a UNet CNN that uses MobileNetV2 as the backbone of an encoder-decoder architecture, running on a Snapdragon 855 mobile phone. I’ve also converted the model to FP16 using a mixed precision pass. It works well, with a meaningful speedup relative to TFLite. As expected, tuning is rather slow. Looking at the tasks generated by
target = "llvm -device=arm_cpu -model=snapdragon855 -mtriple=aarch64-linux-android -mcpu=kryo -mattr=+neon,+fullfp16,+fp-armv8"
tasks = autotvm.task.extract_from_program(mod["main"], target=target, params=params)
I get tasks like Task(func_name=conv2d_NCHWc.x86 as well as tasks like Task(func_name=conv2d_nchw_spatial_pack.arm_cpu. I don’t understand why x86 tasks are generated, or how they are useful for an ARM-based architecture. Are they a waste of the tuning time budget? If so, how can I remove them? If not, why are they needed?
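To see how the tuning budget would be split, the tasks can be grouped by the suffix of their names. This is only a minimal sketch: it assumes each task exposes the printed func_name as a name attribute, and FakeTask is a hypothetical stand-in so the snippet runs without TVM. It shows how a filter could be written, not whether dropping the x86 tasks is actually safe, which is the open question here.

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-in for autotvm tasks so the sketch runs without TVM.
# Assumption: real tasks expose the printed func_name as `.name`.
FakeTask = namedtuple("FakeTask", ["name"])


def group_by_backend(tasks):
    """Group tasks by the text after the last '.' in their name."""
    groups = defaultdict(list)
    for task in tasks:
        backend = task.name.rsplit(".", 1)[-1]
        groups[backend].append(task)
    return dict(groups)


# Example names taken from the task list above.
tasks = [
    FakeTask("conv2d_NCHWc.x86"),
    FakeTask("conv2d_nchw_spatial_pack.arm_cpu"),
]

groups = group_by_backend(tasks)
# Candidate list if only the arm_cpu templates were to be tuned.
arm_only = groups.get("arm_cpu", [])

for backend, members in groups.items():
    print(backend, len(members))
```

With the real list from extract_from_program, the same function would be called on tasks directly, and arm_only would then be the list passed on to the tuner.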
In addition, it seems that I only have conv2d tasks. Of course, those are the slowest ops, but does it make sense to tune the bias_add and ReLU ops too?