TVM integration with external DSP accelerator API

Hello, I have been tasked with implementing a TVM codegen that generates code from Relay operators for an external DSP accelerator which has a C API.

The accelerator API consists of C functions, such as:

void f1(int* a, int* b, char* c);

void f2(float* a, int64_t* b, char* c);

Each Relay operator can be matched with one or several DSP API functions; alternatively, several Relay operators can be matched with one or several accelerator API functions.

The issue is that the implementation must be modular.

It should be relatively easy for the user to register relay ops to the corresponding DSP C API functions. Therefore, I would like to create some sort of data structure/dictionary that will map operators to corresponding DSP C API functions.

The registration API could be in Python, for example, so that the user can add a new entry for a Relay op and map it to the external API.

It should also include a function per operator that returns true if the operator can be implemented with the aforementioned API, given its input and output types.

This data structure/dictionary will be used in the codegen.
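To make this concrete, here is a minimal sketch of the kind of registry I have in mind, written in plain Python with no TVM dependency. All names (DspMapping, register_dsp_mapping, lookup) are hypothetical, and pairing f1 with add or f2 with nn.dense is purely illustrative, since nothing above says what f1 and f2 actually compute.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# Hypothetical predicate type: (input dtypes, output dtype) -> bool
SupportCheck = Callable[[Sequence[str], str], bool]


@dataclass(frozen=True)
class DspMapping:
    """One user-registered mapping from Relay op(s) to DSP C API function(s)."""
    ops: Tuple[str, ...]          # Relay op names matched, in order
    c_functions: Tuple[str, ...]  # DSP C API functions to call, in order
    is_supported: SupportCheck    # True if the DSP can handle these dtypes


# Keyed by the tuple of op names, so one-to-many and many-to-many both fit.
_REGISTRY: Dict[Tuple[str, ...], List[DspMapping]] = {}


def register_dsp_mapping(ops, c_functions, is_supported) -> DspMapping:
    """Add a new entry; this is the only call a user needs to know."""
    key = tuple(ops)
    mapping = DspMapping(key, tuple(c_functions), is_supported)
    _REGISTRY.setdefault(key, []).append(mapping)
    return mapping


def lookup(ops, in_dtypes, out_dtype) -> Optional[DspMapping]:
    """Return the first registered mapping whose predicate accepts the dtypes."""
    for mapping in _REGISTRY.get(tuple(ops), []):
        if mapping.is_supported(in_dtypes, out_dtype):
            return mapping
    return None


# Illustrative entries only: the op/function pairings are assumptions.
register_dsp_mapping(
    ["add"], ["f1"],
    lambda ins, out: all(d == "int32" for d in ins) and out == "int32",
)
register_dsp_mapping(
    ["nn.dense", "nn.relu"], ["f2", "f1"],
    lambda ins, out: bool(ins) and ins[0] == "float32" and out == "float32",
)
```

The registry only uses op names and dtype strings, so it does not depend on any particular IR; the codegen would be responsible for extracting those from the graph before calling lookup.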

Can someone advise how to define such a data structure/dictionary so that it is agnostic of the IR?

Looks like it could be abstracted as calling a packed function… At the low level you can use call_packed, as demonstrated here; CC @yuchenj on the high-level IR.

Sounds like a good case for https://github.com/tlc-pack/relax/issues/46

@MajorTom would BYOC + pattern matching against Relay ops work for your use case?

@areusch @ziheng @junrushao1994 Thanks for all your ideas. :slightly_smiling_face:

Yes, my general idea is to use BYOC and pattern matching in TVM. My problem is how to make this solution modular and scalable.

Let’s say a user would like to add a new API function and link it to some Relay op. The user is not familiar with the internals of TVM and Relay. I would like to create a dictionary or other data structure to which it is relatively easy to add a new API function. Ideally, this dictionary would use the Relay op as the key, and the value would include the corresponding kernel ID in the DSP, the kernel API prototype, and maybe some additional information, such as a boolean check function that returns true if this op can be annotated for the DSP with the given inputs. My codegen module would take this dictionary as input and parse it to generate code automatically.
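As a rough sketch of how the codegen could consume such a dictionary, assuming plain Python and no TVM dependency: the names KernelEntry, KERNELS, emit_prototype and emit_call are hypothetical, the kernel ID value is made up, and mapping the Relay op add to f1 is only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Sequence, Tuple


@dataclass(frozen=True)
class KernelEntry:
    kernel_id: int                                # kernel ID on the DSP (made-up value below)
    c_name: str                                   # C API function name
    arg_types: Tuple[str, ...]                    # C parameter types, in order
    can_offload: Callable[[Sequence[str]], bool]  # check on the op's input dtypes


# Relay op name -> kernel description. Users would only ever edit this table.
KERNELS: Dict[str, KernelEntry] = {
    "add": KernelEntry(
        kernel_id=0,
        c_name="f1",
        arg_types=("int*", "int*", "char*"),
        can_offload=lambda dtypes: all(d == "int32" for d in dtypes),
    ),
}


def emit_prototype(entry: KernelEntry) -> str:
    """Render the C prototype for a kernel entry."""
    params = ", ".join(f"{t} arg{i}" for i, t in enumerate(entry.arg_types))
    return f"void {entry.c_name}({params});"


def emit_call(op_name: str, dtypes: Sequence[str],
              arg_names: Sequence[str]) -> Optional[str]:
    """Render a C call for the op, or None if it cannot be offloaded."""
    entry = KERNELS.get(op_name)
    if entry is None or not entry.can_offload(dtypes):
        return None
    if len(arg_names) != len(entry.arg_types):
        raise ValueError(f"{entry.c_name} expects {len(entry.arg_types)} arguments")
    args = ", ".join(f"({t}){a}" for t, a in zip(entry.arg_types, arg_names))
    return f"{entry.c_name}({args});"


print(emit_prototype(KERNELS["add"]))
print(emit_call("add", ["int32", "int32"], ["in0", "in1", "out0"]))
```

Returning None for unsupported ops keeps the decision of whether to fall back to the default TVM lowering in the caller, which is also where the same can_offload check could be reused during annotation.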

Any idea how to implement something like that?

CC @kparzysz @chuck-pilkington @aquapapaya

@MajorTom have you seen the register_pattern_table pieces? e.g. bnns

If this were something that could be automatically enabled based on the TVM Target string, would that suffice for your use case? I believe that is planned work, though I haven’t seen any RFC for it yet.

Meanwhile, there is a facility for this “one level up” (from tvm.relay.build) in tvmc. We would probably consolidate that underneath tvm.relay.build after landing the auto-partition logic.