Proper way to implement a custom injective/element-wise strategy

Hello!

I’m trying to create a strategy for our custom accelerator, which can accelerate certain element-wise operations, e.g. an element-wise sum, or relu or tanh applied to multiple tensor elements at once. I’m trying to map these operations to hardware primitives (through C function calls) with tensorize (see [TE] Tensorize Elementwise Sum).

So far I have succeeded for a very simple element-wise sum by altering the injective strategy for our accelerator, but now all injective operations are scheduled as a sum. I have a few questions about how to make this more generally applicable without breaking other scheduling steps.

I’ve seen that the conv2d strategy uses attributes to differentiate between different topi operators, but how is this done for injective operators? For example, what is the best way to test whether I’m dealing with an addition rather than an activation or an element-wise product, so that I can map the right tensorization (C function call) to each operation? I haven’t found a target injective strategy in /python/relay/op/strategy that seems to test for this. What would be the best way to go about this? Or am I looking at the wrong type of schedule here, and injective is not the right one for an element-wise sum?

Thanks in advance for your thoughts and recommendations!

This may not be the best approach, but I think one way to do it would be to use the operator tag and name to distinguish different element-wise operations in the injective schedule template. For example, Add has the name T_add and the tag broadcast. You can find more in include/tvm/topi/broadcast.h.

But it may be a little tricky in the fusion case.
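A minimal sketch of the name/tag idea, in plain Python so that it runs without TVM. `OpInfo` is a stand-in for the `name` and `tag` fields of a TE compute op, and `accel_ewise_add` is a hypothetical name for the accelerator's C primitive; only the `T_add` / `broadcast` pair comes from this thread. Unknown ops (for instance a fused op with a different name) return `None`, so the schedule can fall back to the default injective path instead of being tensorized as a sum.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OpInfo:
    """Stand-in for the (name, tag) pair of a compute op; not a TVM class."""
    name: str
    tag: str


# Hypothetical table from (op name, op tag) to an accelerator C primitive.
# ("T_add", "broadcast") is the pair mentioned above; the primitive name
# is an assumption. Further ops would be added as extra entries.
INTRINSICS = {
    ("T_add", "broadcast"): "accel_ewise_add",
}


def select_intrinsic(op: OpInfo) -> Optional[str]:
    """Return the primitive to tensorize with, or None for the default schedule."""
    return INTRINSICS.get((op.name, op.tag))


print(select_intrinsic(OpInfo("T_add", "broadcast")))
print(select_intrinsic(OpInfo("C", "")))
```

Inside a real schedule function, the same lookup would presumably be driven by the `name` and `tag` of each op being scheduled, with tensorize applied only when a primitive is found.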


Thanks for your reply, @leeexyz! Yes, I was thinking about doing something like that, but operator fusion indeed worries me a bit too.

To do this, is it possible to extract the operator tag and name from within Python? Right now, for this compute expression:

C = te.compute((ro,co,dim1,dim2), lambda i,j,k,l: A[i,j,k,l] + B[i,j,k,l], name="C")

I get these results when I try to access the various parameters:

>>> s[C].op
compute(C, body=[(A[i, j, k, l] + B[i, j, k, l])], axis=[iter_var(i, range(min=0, ext=26)), iter_var(j, range(min=0, ext=26)), iter_var(k, range(min=0, ext=6)), iter_var(l, range(min=0, ext=2))], reduce_axis=[], tag=, attrs={})

>>> s[C].op.tag
''
>>> s[C].op.name
'C'

I’ve seen T_add popping up in the generated C code, but I cannot seem to access it from Python. Should I use the TE C++ interface for this? I could perhaps match the body of the computation with a regular expression, but that seems like a bad idea to me…

Thanks!

@JosseVanDelm Only built-in ops come with these names and tags. Try invoking topi.add directly. For a customized op, you have to set the name and tag yourself (te.compute accepts both as arguments).


I was unaware of that! That makes a lot of sense! Thanks, @leeexyz!