You’re right; a few more op patterns were added in the code that were not in the paper. My best understanding of the differences is below:
kElemWise applies an operation to each element of the input tensors, so the output has the same shape as the input. An example access function is A[i,j] = f(B[i,j]) for some function f. All activation functions, and most if not all unary functions like relu, tanh, sigmoid, exp, etc., are kElemWise.
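As an illustration (a minimal NumPy sketch, not TVM's actual implementation), the elementwise access pattern can be written as an explicit loop over indices:

```python
import numpy as np

# Sketch of the kElemWise pattern: apply f independently to each
# element, so the output shape equals the input shape.
def elemwise(f, B):
    A = np.empty_like(B)
    for idx in np.ndindex(B.shape):
        A[idx] = f(B[idx])  # A[i,j] = f(B[i,j])
    return A

B = np.array([[-1.0, 2.0], [3.0, -4.0]])
relu = lambda x: max(x, 0.0)  # a stand-in unary activation
A = elemwise(relu, B)         # same shape as B
```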
kBroadcast can have an output shape that is greater than or equal to the input shapes, but the order of the axes cannot change. For example, adding two tensors of shapes (1,100) and (10,100) produces an output of shape (10,100), but the values of the input and output must be accessed in the same axis order. That is why they say transpose is not broadcasting; they have already given the access mapping in the comments.
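The shape behavior above can be seen directly with NumPy broadcasting (a sketch of the pattern, not TVM code): the size-1 axis is repeated, and output indices map to input indices in the same order.

```python
import numpy as np

# Sketch of the kBroadcast pattern: the (1,100) input is repeated
# along its size-1 axis; axis order is preserved.
x = np.ones((1, 100))
y = np.arange(1000, dtype=np.float64).reshape(10, 100)
out = x + y  # output shape (10, 100)
# Access mapping: out[i, j] = x[0, j] + y[i, j]
```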
kInjective is any fully injective function: each output element reads exactly one input element, though the shape and axis order may change. Almost all of the data-movement operations, like transpose, depth_to_space, resize, reshape, etc., are injective functions.
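For example (again a NumPy sketch rather than TVM code), transpose and reshape each read exactly one input element per output element, but unlike broadcasting they are free to reorder axes:

```python
import numpy as np

# Sketch of the kInjective pattern: one input element per output
# element, but the axis order and shape can change.
B = np.arange(6).reshape(2, 3)
T = B.T              # transpose: T[i, j] = B[j, i]
R = B.reshape(3, 2)  # reshape:   R[i, j] = B.flat[i * 2 + j]
```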
I think the one-to-one confusion between injective and bijective is real: even Wikipedia calls an injective function a "one-to-one function," whereas a bijective function is defined as a "one-to-one correspondence."