Hello @abhikran-quic,
Thanks for raising this. I am also interested in generating subgraphs from an existing graph to run on different CPUs/accelerators.
In my previous work, I followed @hjiang’s old post to split the existing graph into N subgraphs.
However, as I mentioned in my previous post, I found that each subgraph can only have one global output, which is its last operation.
When I check the data dependencies, I notice that there are other dependencies besides the last operation:
For example, in my post, %42 of the first subgraph → %x1: Tensor[(1, 1, 1, 128), float32] of the second subgraph. This operation is a constant that feeds every layer (e.g., %19 in the second subgraph). However, I cannot pass this data dependency to the next subgraph, because it is not registered as a global output of the first subgraph.
Thus, I am wondering whether a user can register operations in Relay IR as new outputs, so that they can be read out (or, in my case, sent to another subgraph).
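To make the question concrete, here is a toy sketch in plain Python. It is not TVM code: the graph, the node names (which only loosely mirror %42 and %19 above), and the helper `cross_boundary_values` are all made up for illustration. It just lists which values of the first subgraph are consumed by the second one, i.e. which values would need to be exposed as outputs.

```python
# Toy model, NOT the TVM/Relay API: each node maps to the nodes it consumes.
# All names here are hypothetical and only illustrate the dependency pattern.
graph = {
    "input": [],
    "n41": ["input"],
    "n42": ["n41"],          # constant-like value reused by later layers
    "n43": ["n42"],          # last operation of the first subgraph
    "n19": ["n43", "n42"],   # second subgraph: also needs n42
    "n20": ["n19", "n42"],
}


def cross_boundary_values(graph, first, second):
    """Return nodes of `first` that are consumed by any node in `second`."""
    first_set = set(first)
    needed = []
    for node in second:
        for dep in graph[node]:
            if dep in first_set and dep not in needed:
                needed.append(dep)
    return needed


first_subgraph = ["input", "n41", "n42", "n43"]
second_subgraph = ["n19", "n20"]

required_outputs = cross_boundary_values(graph, first_subgraph, second_subgraph)
print(required_outputs)  # ['n43', 'n42']
```

In this toy example the split only works if the first subgraph exports both `n43` (its last operation) and `n42`, which is the situation I am running into.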
Moreover, @abhikran-quic, could you share more information about which documents you followed and how you use the PartitionGraph function?
Thanks for your help.
cc @comaniac