Yes, the key is to use compute_at(…). For example, the x86 schedule uses it here to fuse a convolution with the following elementwise operations (bias add, batch norm, relu).
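As a rough illustration of the pattern (a minimal sketch, not the actual x86 schedule: it assumes the pre-0.7 `tvm.compute` API and a toy single-channel, 3-tap convolution):

```python
import tvm

# Toy producer-consumer pair: a 3-tap convolution along rows,
# followed by bias add + relu.
N = 64
A  = tvm.placeholder((N, N), name="A")
Wt = tvm.placeholder((3,), name="Wt")
B  = tvm.placeholder((N,), name="B")

k = tvm.reduce_axis((0, 3), name="k")
conv = tvm.compute((N - 2, N),
                   lambda i, j: tvm.sum(A[i + k, j] * Wt[k], axis=k),
                   name="conv")
relu = tvm.compute((N - 2, N),
                   lambda i, j: tvm.max(conv[i, j] + B[j], tvm.const(0, "float32")),
                   name="relu")

s = tvm.create_schedule(relu.op)
# The fusion step: compute conv inside relu's outer loop instead of
# materializing the whole intermediate buffer first.
s[conv].compute_at(s[relu], relu.op.axis[0])
print(tvm.lower(s, [A, Wt, B, relu], simple_mode=True))
```

After the compute_at, each row of conv is produced right before the relu row that consumes it, so the intermediate never round-trips through a full-size buffer.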
Imagine how you would implement a fused convolution, say targeting a GPU. Before you can start the second convolution on a single pixel, you have to wait for the neighbouring pixels to finish their first convolution. This requires a global sync at the shared-memory boundary. Since we would then need to store the output of the first convolution in global memory anyway, we get no benefit from fusing.
For other architectures it might be doable, but at least in TVM we don’t fuse consecutive convolutions.
NNVM (at least v1) had fusion rules which prevented automatic fusion (at the NNVM level) of two neighbouring convolution layers, so all automatically generated TVM “tasks” (i.e. compositions of stages) had only one conv layer. That statement can be read in two ways:
1. It is not possible to generate TVM tasks which describe two neighbouring convs.
2. It is not possible to use TVM scheduling primitives (i.e. tvm.compute_at) to fuse two convs.

AFAIK:
1. Is true, but it is a limitation posed by how NNVM (v1?) was used during operator fusion.
2. Is false, at least as far as describing the computation goes. You can check by defining two tvm.compute stages which describe two conv2ds and using tvm.lower to get a printout.
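A minimal sketch of that check (assuming the pre-0.7 tvm.compute API; single-channel 3x3 convs with no padding, to keep it short):

```python
import tvm

H, W = 16, 16
A  = tvm.placeholder((H, W), name="A")
W1 = tvm.placeholder((3, 3), name="W1")
W2 = tvm.placeholder((3, 3), name="W2")

# First 3x3 convolution (valid padding); names match the printout below.
r1 = tvm.reduce_axis((0, 3), name="r1")
s1 = tvm.reduce_axis((0, 3), name="s1")
conv1 = tvm.compute(
    (H - 2, W - 2),
    lambda i, j: tvm.sum(A[i + r1, j + s1] * W1[r1, s1], axis=[r1, s1]),
    name="conv1_res")

# Second 3x3 convolution consuming the first.
r2 = tvm.reduce_axis((0, 3), name="r2")
s2 = tvm.reduce_axis((0, 3), name="s2")
conv2 = tvm.compute(
    (H - 4, W - 4),
    lambda i, j: tvm.sum(conv1[i + r2, j + s2] * W2[r2, s2], axis=[r2, s2]),
    name="conv2_res")

sch = tvm.create_schedule(conv2.op)
print(tvm.lower(sch, [A, W1, W2, conv2], simple_mode=True))
```

which lowers to something like: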
```
produce conv1_res {
  // code which implements conv2d goes here
}
produce conv2_res {
  // code which implements conv2d goes here
}
```
Whether tvm.compute_at can then actually fuse the two stages is undefined (I haven’t tried it). Conceptually, I think it is possible, since there is an obvious producer-consumer relation and the tensor shape relations are also known.
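Attempting it on the sketch above would look roughly like this (untested, per the above, so treat it as a hypothesis rather than a confirmed schedule):

```python
# Continuing the two-conv sketch: try to compute conv1 at conv2's outer loop.
sch = tvm.create_schedule(conv2.op)
i, j = conv2.op.axis
sch[conv1].compute_at(sch[conv2], i)
# If bound inference succeeds, each iteration over i should compute just
# the 3 rows of conv1_res that the current row of conv2_res consumes.
print(tvm.lower(sch, [A, W1, W2, conv2], simple_mode=True))
```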
You mention that it requires a global sync; however, the sync is not necessary if we allow redundant computation.
There are many examples in the Halide papers.
In fact, where and when to compute the pixels brings different trade-offs between producer-consumer locality, input locality, and redundant computation.
In my opinion, fusing two convs can also open up a larger exploration space for performance tuning; see the sketch below.
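To make the trade-off concrete on the two-conv sketch from earlier (my own illustration of the usual Halide-style choices, not code from the papers):

```python
# Three placements of conv1, three different points in the trade-off space:
sch = tvm.create_schedule(conv2.op)
i, j = conv2.op.axis

# (a) Root (the default): conv1_res is fully materialized before conv2 runs.
#     No redundant computation, but poor producer-consumer locality.
# sch[conv1].compute_root()

# (b) At the outer loop i: the 3 rows of conv1_res needed by one row of
#     conv2_res are computed per iteration; interior conv1 rows get
#     recomputed up to 3 times in exchange for better locality.
sch[conv1].compute_at(sch[conv2], i)

# (c) At the inner loop j: the 3x3 window of conv1_res is recomputed for
#     every output pixel (~9x redundant work), with maximal locality.
# sch[conv1].compute_at(sch[conv2], j)
```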
I think TVM is capable of generating code that fuses two convs, but it does not do so today.