You’re right; a few more op patterns were added in the code that were not in the paper. My best understanding of the differences is below:
kElemWise applies an operation to each element of the input tensors, so the output has the same shape as the input. An example access function is A[i,j] = f(B[i,j]) for some function f. All activation functions, and most if not all unary functions like relu, tanh, sigmoid, exp, etc., are kElemWise.
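As an illustration (a minimal NumPy sketch, not TVM's actual implementation), the elementwise access pattern can be written as an explicit loop over indices:

```python
import numpy as np

# Sketch of the kElemWise pattern: apply f independently to each
# element, so the output shape equals the input shape.
def elemwise(f, B):
    A = np.empty_like(B)
    for idx in np.ndindex(B.shape):
        A[idx] = f(B[idx])  # A[i,j] = f(B[i,j])
    return A

B = np.array([[-1.0, 2.0], [3.0, -4.0]])
relu = lambda x: max(x, 0.0)  # a stand-in unary activation
A = elemwise(relu, B)         # same shape as B
```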
kBroadcast can have an output shape that is greater than or equal to the input shapes, but the order of the axes cannot change. For example, adding two tensors of shapes (1,100) and (10,100) produces an output of shape (10,100), but the values of the input and output must be accessed in the same axis order. That is why they say transpose is not broadcasting; they have already given the access mapping in the comments.
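The shape behavior above can be seen directly with NumPy broadcasting (a sketch of the pattern, not TVM code): the size-1 axis is repeated, and output indices map to input indices in the same order.

```python
import numpy as np

# Sketch of the kBroadcast pattern: the (1,100) input is repeated
# along its size-1 axis; axis order is preserved.
x = np.ones((1, 100))
y = np.arange(1000, dtype=np.float64).reshape(10, 100)
out = x + y  # output shape (10, 100)
# Access mapping: out[i, j] = x[0, j] + y[i, j]
```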
kInjective is any fully injective function: each output element reads exactly one input element, though the shape and axis order may change. Almost all of the data-movement operations, like transpose, depth_to_space, resize, reshape, etc., are injective functions.
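For example (again a NumPy sketch rather than TVM code), transpose and reshape each read exactly one input element per output element, but unlike broadcasting they are free to reorder axes:

```python
import numpy as np

# Sketch of the kInjective pattern: one input element per output
# element, but the axis order and shape can change.
B = np.arange(6).reshape(2, 3)
T = B.T              # transpose: T[i, j] = B[j, i]
R = B.reshape(3, 2)  # reshape:   R[i, j] = B.flat[i * 2 + j]
```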
I think the one-to-one confusion between injective and bijective is real: even Wikipedia calls an injective function a "one-to-one function," whereas a bijective function is defined as a "one-to-one correspondence."