Have you looked at the existing schedules (templates) in TVM? Many of them may already use variations of the strategies that you bring up here. If you have a graph declaration of your network (e.g., in terms of convolution operations), it will likely be possible to optimize this automatically.
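To make the idea of a schedule template a bit more concrete, here is a minimal conceptual sketch in plain Python. It deliberately does not use TVM's API: `matmul_tiled` and `tune` are hypothetical names, and the tile sizes are arbitrary. It only illustrates the general pattern as I understand it: the computation stays fixed, a few knobs (here two tile sizes) describe how the loops are arranged, and a tuner measures each candidate and keeps the fastest.

```python
import itertools
import time


def matmul_tiled(a, b, n, tile_i, tile_j):
    """Compute the n x n product of a and b, visiting the output in tile_i x tile_j tiles."""
    c = [[0.0] * n for _ in range(n)]
    for i0 in range(0, n, tile_i):
        for j0 in range(0, n, tile_j):
            # The tile sizes only change the loop order, never the result.
            for i in range(i0, min(i0 + tile_i, n)):
                row_a, row_c = a[i], c[i]
                for j in range(j0, min(j0 + tile_j, n)):
                    acc = 0.0
                    for k in range(n):
                        acc += row_a[k] * b[k][j]
                    row_c[j] = acc
    return c


def tune(n, candidates):
    """Time every (tile_i, tile_j) pair; return the fastest pair and all timings."""
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    timings = {}
    for tile_i, tile_j in itertools.product(candidates, repeat=2):
        start = time.perf_counter()
        matmul_tiled(a, b, n, tile_i, tile_j)
        timings[(tile_i, tile_j)] = time.perf_counter() - start
    best = min(timings, key=timings.get)
    return best, timings


if __name__ == "__main__":
    best, timings = tune(n=32, candidates=[1, 4, 8, 16])
    print("fastest tiling on this machine:", best)
```

In pure Python the timing differences are mostly noise, so treat this as an illustration of the search loop only. A real tuner works on compiled kernels and a much larger search space.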