Graph rewrite with multi-output patterns

    A
    |
    B
  /   \
C       D
|       |
E       F

I intend to rewrite A,B,C,D with a fused function G.

    G
  /   \
E       F

Does the current pattern matching scheme support subgraph with multiple outputs? Any suggestion on how to achieve this? Thanks!

No, the pattern with multiple outputs is not supported. Since we leverage post-order traversal to match patterns, output is actually the entry point and must be single. As a result, the current pattern requires a post-dominated node at the end.

I guess we could probably use two-phase pattern matching to match a partial pattern and combine than afterwards, but it seems too complicate to be done easily.

@mbrookhart do you have any suggestion or workaround for this case?

1 Like

I have on my internal backlog “Support multi-output pattern matching for RNN fusion” and have never gotten around to it. :frowning:

It’s a fairly non-trivial refactor. We’d need to introduce a multi-output node in the pattern matcher AST (basically a tuple of outputs that doesn’t match like a tuple), find the first one, and then do bi-directional matching in the pattern matcher to to ensure that the pattern and the graph connect all of the outputs to the same roots. The data structures are in place, I used bidirectional graphs for the domination analysis, but like I said, it’s a fairly complex rewrite of the C++ side of the pattern matcher. I’d be happy to coach someone on that if they have an interest in AST manipulation, but I’m not sure how quickly I’d get to it myself.

As for a workaround…everything I’m thinking of is really hacky…

1 Like

I guess I have another case:

  A  D  H
  |  |  |
  B  E  I
  |  |  |
  C  F  J

to

  A   D   H
    \ | /
      G
    / | \
  C   F   J

Essentially, I can match A-B-C, D-E-F, H-I-J patterns. Can I group them to one relay function? My ultimate goal is to match a group of connected functions and offload to something like byoc. Since the replacing function is highly hackable, as long as the inputs and outputs of the matched pattern are in control, we could get away…

I am not talking about general graph rewrite approach for all models. Yet, such hack can enable aggressive fusion strategies, which lead to better perf for specific models.

This is a bit similar to the Relay passes that combine parallel ops. You might look at CombineParallelDense and CombineParallelBatchMatmul for some ideas. If B, E, I are dense ops, for example, then CombineParallelDense may produce your expected result.

1 Like

I also face the problem of matching multi-output patterns. Is there any new progress for this problem?