[AOT] Supporting BYOC kernels in unpacked interface api

The unpacked interface API for the AoT codegen is a great addition and has advantages over the traditional packed calling convention.

I realized that using BYOC kernels in combination with the unpacked interface is currently not supported, probably because there is no interface to let the codegen know which entry point should be generated.

Are there any plans to get rid of that limitation in the future?

I came up with the following “ideas”:

The simplest approach would probably be to generate BYOC entry points for both types of interface under different function names, and let the AoT codegen decide which function name to call depending on the chosen interface (e.g. use a _packed suffix for the packed interface).

Another option would be to add a way to pass the information on the required interface to the BYOC codegen, which can then generate the appropriate entry point. This might be more complicated, but such an interface could eventually be used for other features as well.
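To make the second idea a bit more concrete, here is a minimal sketch of a code generator helper that emits a different C prototype depending on a flag. Everything in it (InterfaceKind, emit_entry_point, the kernel name) is hypothetical and made up for illustration; it is not an existing TVM API, and a real BYOC codegen would receive the setting through whatever option mechanism ends up being chosen.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical flag telling the BYOC codegen which entry point to emit. */
typedef enum { INTERFACE_PACKED, INTERFACE_UNPACKED } InterfaceKind;

/* Writes a C prototype for kernel `name` into `buf`.
 * Returns 0 on success, -1 if the buffer is too small or formatting fails. */
int emit_entry_point(char* buf, size_t len, const char* name,
                     int num_tensors, InterfaceKind kind) {
  int n;
  size_t used;

  if (kind == INTERFACE_PACKED) {
    /* Packed convention: one fixed signature regardless of tensor count. */
    n = snprintf(buf, len,
                 "int32_t %s(TVMValue* args, int* type_code, int num_args, "
                 "TVMValue* out_value, int* out_type_code);",
                 name);
    return (n >= 0 && (size_t)n < len) ? 0 : -1;
  }

  /* Unpacked convention: one void* parameter per tensor. */
  n = snprintf(buf, len, "int32_t %s(", name);
  if (n < 0 || (size_t)n >= len) return -1;
  used = (size_t)n;

  for (int i = 0; i < num_tensors; ++i) {
    n = snprintf(buf + used, len - used, "%svoid* arg%d", i ? ", " : "", i);
    if (n < 0 || (size_t)n >= len - used) return -1;
    used += (size_t)n;
  }

  n = snprintf(buf + used, len - used, ");");
  if (n < 0 || (size_t)n >= len - used) return -1;
  return 0;
}
```

The point of the sketch is only that a single switch inside the BYOC codegen is enough to select the signature; how that switch reaches the codegen is the open question in this thread.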

CC @Mousius @giuseros @areusch @tqchen

Hi @PhilippvK,

Thanks for putting up a thread on this, it has definitely occurred to me and I believe the solution is hidden in plain sight :smile_cat:

BYOC can support the unpacked API if you change your BYOC code generator to emit functions with the unpacked signature, (void* arg0, void* arg1, void* arg2), instead of the packed function signature, (TVMValue* args, int* type_code, int num_args, TVMValue* out_value, int* out_type_code). Your BYOC output will then only support the unpacked API, which may or may not be desirable. I’d suggest that your option B, passing the information to BYOC via options specific to the code generator, would be the fix here (see: BYOC Options).
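As a rough illustration of the two signatures side by side, here is a small sketch of a kernel exposed through both conventions, with the packed variant forwarding to the unpacked one. The names byoc_add, byoc_add_packed and KERNEL_LEN are hypothetical, and the TVMValue union below is a simplified stand-in so the snippet compiles on its own; the real type comes from the TVM runtime headers. It also assumes each v_handle holds a raw data pointer, whereas in the real packed convention the arguments would typically reference DLTensor structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for TVM's TVMValue (assumption, not the real header). */
typedef union {
  int64_t v_int64;
  double v_float64;
  void* v_handle;
} TVMValue;

/* Hypothetical fixed element count, just to keep the example short. */
#define KERNEL_LEN 4

/* Unpacked-style entry point: one pointer per tensor. */
int32_t byoc_add(void* arg0, void* arg1, void* arg2) {
  const float* a = (const float*)arg0;
  const float* b = (const float*)arg1;
  float* out = (float*)arg2;
  for (size_t i = 0; i < KERNEL_LEN; ++i) {
    out[i] = a[i] + b[i];
  }
  return 0;
}

/* Packed-style entry point with a _packed suffix: unpacks the argument
 * array and forwards to the unpacked implementation. */
int32_t byoc_add_packed(TVMValue* args, int* type_code, int num_args,
                        TVMValue* out_value, int* out_type_code) {
  (void)type_code;      /* type checks omitted in this sketch */
  (void)out_value;
  (void)out_type_code;
  if (num_args != 3) return -1;
  return byoc_add(args[0].v_handle, args[1].v_handle, args[2].v_handle);
}
```

Emitting both functions keeps the BYOC output usable from either interface, at the cost of a little extra generated code per kernel.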

Improvements to this are underway through the combination of Target Hooks and Migrating Target Attributes to IRModule. Together, these mean the configuration for when to use the unpacked API will be available to RelayToTIR and TIRToRuntime via one of:

  • TVM’s default selection of packed vs unpacked in driver_api.cc as it processes the TIR
  • Custom code generation logic which can inspect the IRModule’s configuration

There’s still an issue here with full RelayToRuntime transformations, which only take the Function and no reference to the IRModule. This should eventually be considered, but I haven’t got that far yet :smile_cat: Ideally we’d provide a full-module Pass which consumes the relevant BYOC passes and gives them a full module view.

@PhilippvK thanks for pointing this out. It seems like the “correct” immediate fix would be to pass the Target string to the BYOC backend. However, I’m not sure that’s the right thing to do (it could be), because the --unpacked-api parameter really shouldn’t belong there. We intend to move it elsewhere, e.g. by creating a runtime-config struct.

cc @mbs-octoml @jroesch who are involved with this effort.

Makes sense that all passes – whether built in or contributed via a hook mechanism – should see the same compilation configuration, which includes the actual Target object for post-lowering passes.

https://github.com/apache/tvm-rfcs/pull/29 suggests conveying that via the IRModule attributes. I’m leaning towards passing around a CompilerConfig which every Pass (and every non-Pass-signature hook) gets passed.