Target-dependent graph transformation in NNVM

I am wondering whether NNVM supports target-dependent graph transformations. For example, some fusion operations or layout transformations may make sense on one target but not on others.
From my reading of the source code, it seems that graph lowering can be target-dependent but the graph transformations cannot be. Am I missing something?

Thanks!

Layout transforms can be controlled by the backend. For example, the x86 and CUDA backends do layout transforms differently.

Operator fusion is meant to be target-agnostic at the moment. Do you have a specific example in mind where some fusion is not appropriate for certain devices?

Hi Masahi,
I have a fission (the opposite of fusion) example. Let’s say my target architecture has two matrix multipliers, and I would like to break up the existing matmuls in the graph into finer-grained nodes, possibly fusing the resulting nodes with neighboring nodes, so that I can later assign each fused node to a separate matrix multiplier. The fission (or break-up) pass depends on how many matrix multipliers I have and on their sizes.
I understand that everything can theoretically be done in the backend as well, but doing it at the graph level is much easier and more natural, while the high-level information about the ops is still available.
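
To make the idea concrete, here is a minimal sketch of such a fission pass on a toy graph. It does not use the NNVM API: `Node`, `Target`, `split_matmul`, and `evaluate` are all hypothetical names invented for illustration, and the pass assumes the simplest possible split (by output columns), which may not match what a real backend would want.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    # Toy graph node (hypothetical, not an NNVM type).
    op: str  # "input", "matmul", "slice_cols", or "concat_cols"
    inputs: List["Node"] = field(default_factory=list)
    attrs: dict = field(default_factory=dict)


@dataclass
class Target:
    # Hypothetical target description that the pass depends on.
    num_matmul_units: int  # how many matrix multipliers the device has
    max_cols: int          # widest output tile a single unit can produce


def split_matmul(node: Node, out_cols: int, target: Target) -> Node:
    """Split matmul(A, B) by columns of B into one matmul per tile."""
    # Use at least one piece per unit, and enough pieces that every tile fits.
    pieces = max(target.num_matmul_units, -(-out_cols // target.max_cols))
    pieces = min(pieces, out_cols)
    a, b = node.inputs
    base, extra = divmod(out_cols, pieces)
    parts, start = [], 0
    for i in range(pieces):
        width = base + (1 if i < extra else 0)
        b_slice = Node("slice_cols", [b], {"start": start, "stop": start + width})
        # Record which multiplier this piece is assigned to (round-robin).
        parts.append(Node("matmul", [a, b_slice],
                          {"unit": i % target.num_matmul_units}))
        start += width
    return Node("concat_cols", parts)


def evaluate(node: Node, env: dict):
    """Reference interpreter, used only to check that the rewrite is valid."""
    if node.op == "input":
        return env[node.attrs["name"]]
    args = [evaluate(i, env) for i in node.inputs]
    if node.op == "matmul":
        a, b = args
        return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
                for row in a]
    if node.op == "slice_cols":
        return [row[node.attrs["start"]:node.attrs["stop"]] for row in args[0]]
    if node.op == "concat_cols":
        return [sum(rows, []) for rows in zip(*args)]
    raise ValueError(f"unknown op: {node.op}")


a = Node("input", attrs={"name": "A"})
b = Node("input", attrs={"name": "B"})
mm = Node("matmul", [a, b])
env = {"A": [[1, 2], [3, 4]], "B": [[1, 0, 2, 1], [0, 1, 1, 3]]}

target = Target(num_matmul_units=2, max_cols=2)
split = split_matmul(mm, out_cols=4, target=target)
```

The point is that `split_matmul` takes the target description as an argument, so the same graph is rewritten differently for a device with two multipliers than for one with four, while the result of evaluating the graph stays the same.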
