My responses to 1-3:
- You can customize the annotator to generate subgraph 1 while avoiding subgraph 2:
subgraph 1:
subgraph_begin -> conv2d -> ReLU -> subgraph_end
subgraph 2:
subgraph_begin -> ReLU -> conv2d -> subgraph_end
- That’s our next step. We will develop an algorithm to group (i.e., partition/annotate) offloadable ops into one subgraph, and the backend can then decide whether to fuse them or not.
- This is a valuable question and we are also investigating it. What’s missing right now is an interface/API that lets backend developers describe a tuning space and invoke AutoTVM. This will be follow-up work, but whatever interface/API we come up with, it will fit into the current design (specifically, this logic will live in the compile function of the customized codegen).
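
To make the first point concrete, here is a rough sketch of what a pattern-based annotator could look like. It is plain Python over a linear list of op names, not the actual TVM annotator API; the function name `annotate` and the `pattern` parameter are hypothetical and only there for illustration.

```python
from typing import List, Tuple


def annotate(ops: List[str],
             pattern: Tuple[str, ...] = ("conv2d", "ReLU")) -> List[str]:
    """Wrap each occurrence of `pattern` in subgraph_begin/subgraph_end.

    Hypothetical helper: assumes a purely linear op sequence, which is a
    simplification of a real dataflow graph.
    """
    out: List[str] = []
    n = len(pattern)
    i = 0
    while i < len(ops):
        # Only the exact order in `pattern` is offloaded.
        if tuple(ops[i:i + n]) == pattern:
            out.append("subgraph_begin")
            out.extend(pattern)
            out.append("subgraph_end")
            i += n
        else:
            # Anything else stays outside the subgraph.
            out.append(ops[i])
            i += 1
    return out


if __name__ == "__main__":
    print(" -> ".join(annotate(["conv2d", "ReLU"])))
    print(" -> ".join(annotate(["ReLU", "conv2d"])))
```

With this sketch, `conv2d -> ReLU` gets wrapped (subgraph 1), while `ReLU -> conv2d` is left untouched, so subgraph 2 is never generated.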