[RFC] 'Cascade' Scheduling

From your response to TQ, I get the impression that the current limitation comes mainly from the TOPI implementation, which fixes the granularity of the conv2d schedule. That limitation could potentially be resolved by the auto-scheduler (Ansor). In fact, we at AWS are attempting to use Ansor to perform more aggressive scheduling on fused ops (e.g., two conv2ds), and one core idea is exactly to leverage compute_at. We’ve submitted the proposal and initial results of this project to the TVM conference, and hopefully we will have a chance to share it there.
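
To make the compute_at idea concrete, here is a toy sketch in plain Python. It is not TVM code and not our actual Ansor work: the two 1-D stages, the function names, and the tile size are all made up for illustration. It only shows the effect compute_at has on a producer/consumer pair: the producer is computed per consumer tile (plus a halo) instead of fully up front, so the intermediate buffer shrinks.

```python
# Conceptual illustration only: hypothetical 1-D stages, no TVM API involved.

def producer(x, i):
    # Stage A: stands in for the first op of a fused pair.
    return x[i] * 2

def schedule_root(x):
    # Producer computed "at root": the whole intermediate is materialized first.
    a = [producer(x, i) for i in range(len(x))]
    # Stage B: a 2-tap stencil over A, standing in for the second op.
    b = [a[i] + a[i + 1] for i in range(len(x) - 1)]
    return b, len(a)  # result and intermediate buffer size

def schedule_compute_at(x, tile=4):
    # Producer computed at the consumer's tile loop: only the region a tile
    # needs (tile + 1 halo element) is materialized at any time.
    n_out = len(x) - 1
    b = [0] * n_out
    peak = 0
    for start in range(0, n_out, tile):
        stop = min(start + tile, n_out)
        a_tile = [producer(x, i) for i in range(start, stop + 1)]
        peak = max(peak, len(a_tile))
        for i in range(start, stop):
            b[i] = a_tile[i - start] + a_tile[i - start + 1]
    return b, peak  # result and largest intermediate buffer used

x = list(range(10))
b_root, buf_root = schedule_root(x)
b_fused, buf_fused = schedule_compute_at(x, tile=4)
```

Both schedules give the same output, but the second one only ever holds a tile-sized piece of the intermediate. The cost is that halo elements get recomputed at tile boundaries, and that trade-off is what a search over tile sizes and compute_at locations would have to balance.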