Hello!
I’m trying to create a strategy for our custom accelerator, which can accelerate certain element-wise operations, e.g. computing an element-wise sum or applying relu or tanh to multiple tensor elements at once. I’m trying to map these operations to hardware primitives (through C function calls) with tensorize (see [TE] Tensorize Elementwise Sum).
So far this works for a very simple element-wise sum: I altered the injective strategy for our accelerator, but now all injective operations are scheduled as a sum. I have a few questions about this, and about how to make the approach more generally applicable so that it doesn’t break other scheduling steps.
I’ve seen that the conv2d strategy uses attributes to differentiate between different topi operators, but how is this done for injective operators? For example, what is the best way to test whether I’m scheduling an addition rather than an activation or an element-wise product, so that I can map the right tensorization (C function call) to each operation? I haven’t found a target injective strategy in /python/relay/op/strategy that tests for this. What would be the best way to go about it? Or am I looking at the wrong type of schedule here, i.e. is injective not the right choice for an element-wise sum?
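To make the question concrete, here is a minimal sketch of the kind of dispatch I have in mind. It is plain Python, not TVM code: the C function names (`accel_ewise_sum` etc.) and the helpers `pick_intrinsic` and `schedule_injective` are hypothetical, and using operator-name strings as keys is only my assumption about how the operations could be told apart.

```python
from typing import Dict, Optional

# Hypothetical table: operator name -> C function exposed by our accelerator.
# The keys are assumed operator names; the values are made-up C symbols.
INTRINSICS: Dict[str, str] = {
    "add": "accel_ewise_sum",
    "multiply": "accel_ewise_mul",
    "nn.relu": "accel_relu",
    "tanh": "accel_tanh",
}


def pick_intrinsic(op_name: str) -> Optional[str]:
    """Return the C function to tensorize with, or None if unsupported."""
    return INTRINSICS.get(op_name)


def schedule_injective(op_name: str) -> str:
    """Describe which schedule would be chosen for an injective operator."""
    c_func = pick_intrinsic(op_name)
    if c_func is None:
        # Unsupported operators keep the generic injective schedule,
        # so they are no longer scheduled as a sum by mistake.
        return "default injective schedule"
    return f"tensorize with {c_func}"


if __name__ == "__main__":
    for name in ("add", "nn.relu", "exp"):
        print(name, "->", schedule_injective(name))
```

The point is the fallback: anything without a matching hardware primitive should keep the default injective schedule. What I don’t know is where the equivalent of `op_name` is available inside the real strategy or schedule function.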
Thanks in advance for your thoughts and recommendations!