I am trying to wrap my head around the VTA software implementation. I found the transform.py file and want to ask how the functions in it are called. As an example, let’s look at InjectDMAIntrin. There must be some way to identify the part of the TIR where this operation is to be inserted. But how is that done?
In general, I want to understand how to incorporate custom accelerators in TVM. So far I understand the pattern match and replace part, and now I want to take the next step and understand how the code generation works.
I am not sure I fully understand the initial question, but I can share some resources for code generation.
A bit about me: I am relatively new to TVM and compilation, but I managed to get a dummy codegen running last week on the unity branch.
Once you know how to write a codegen, you’ll need to know how to get it called. There is a new branch of TVM in active development, called unity, which among other things makes calling into BYOC much more natural and flexible. If that interests you, there is a notebook showcasing how you can get your BYOC codegen running under unity.
For calling your codegen without unity you may find what you need in this BYOC blogpost.
Note: The BYOC documentation and blogpost might not be fully up to date, so I suggest cross-checking them against existing implementations. For unity these are in src/relax/backend/contrib (see the cutlass implementation for a C codegen and the tensorrt one for a JSON codegen) and for pre-unity in src/relay/backend/contrib.
Thank you! I am aware of the blogs, but none of them seem to describe the way it is done for VTA. I have found something on conv2d specifically, but nothing so far that explains the contents of the transform.py file.
Although, it might be best to ignore the VTA implementation and just do it on the Unity branch. But I still want to know how these Inject functions are called.
I spent some more time looking into this. On Unity, at least the pattern matching and code gen seem to be much better integrated, so that’s already nice. The thing that interests me probably happens only after the tutorial, as I want to add a completely new target to TVM. Probably that’s also where the transform.py file comes into play, as you have to tell TVM when and where to insert certain function calls.
But I still can’t find any documentation of that last part. In any case, thanks for the hint!
Ok, I figured it out. If someone comes across this in the future:
The operations performed in the mentioned file are inserted using the InjectCopyIntrin pass. It checks the TIR for pragmas and inserts the corresponding function call when generating code. For example, InjectDMAIntrin inserts DMA accesses at all locations where the “dma_copy” pragma is set (see the matching call of InjectCopyIntrin).
The pragmas are set during schedule construction. For examples, take a look at the way it is done in vta_conv2d.py or the other operators defined in the same folder.
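For anyone who wants the idea without reading the VTA sources first, here is a toy model of the two steps in plain Python. It does not use TVM at all: the classes `Copy`, `Call`, `Pragma` and `Seq`, the functions `inject_copy_intrin` and `dma_intrin`, and the name `toy_dma_load` are all made up for illustration and only mimic the shape of the mechanism (a schedule tags a copy with a pragma, and a pass later swaps every tagged copy for an intrinsic call). The real pass works on TIR and its callback receives more information than this sketch assumes.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Copy:
    """A plain copy from one buffer to another (stands in for a copy loop nest)."""
    src: str
    dst: str


@dataclass
class Call:
    """A call to a (hypothetical) hardware intrinsic."""
    name: str
    args: tuple


@dataclass
class Pragma:
    """An annotation wrapped around a statement, similar in spirit to a TIR pragma."""
    key: str
    body: "Stmt"


@dataclass
class Seq:
    """A sequence of statements."""
    stmts: list


Stmt = Union[Copy, Call, Pragma, Seq]


def inject_copy_intrin(pragma_key, fintrin):
    """Build a toy pass that rewrites copies tagged with `pragma_key`.

    `fintrin(src, dst)` is a user callback that returns the replacement statement.
    """
    def run(stmt):
        if isinstance(stmt, Seq):
            return Seq([run(s) for s in stmt.stmts])
        if isinstance(stmt, Pragma):
            if stmt.key == pragma_key and isinstance(stmt.body, Copy):
                # Matching pragma: drop the annotation, emit the intrinsic call.
                return fintrin(stmt.body.src, stmt.body.dst)
            # Some other pragma: keep it and recurse into its body.
            return Pragma(stmt.key, run(stmt.body))
        return stmt  # untagged statements are left alone
    return run


def dma_intrin(src, dst):
    """Toy callback: replace a tagged copy with a made-up DMA call."""
    return Call("toy_dma_load", (src, dst))


# "Schedule construction": only the first copy is tagged with "dma_copy".
program = Seq([
    Pragma("dma_copy", Copy("dram_input", "sram_input")),
    Copy("sram_input", "sram_tmp"),
    Pragma("alu", Copy("sram_tmp", "sram_out")),
])

# "Lowering": the pass is built for one pragma key and applied to the program.
inject_dma = inject_copy_intrin("dma_copy", dma_intrin)
lowered = inject_dma(program)
print(lowered)
```

Only the copy tagged with “dma_copy” is turned into a call. The untagged copy and the one tagged with a different key pass through unchanged, which is why the pragma alone is enough to tell the pass where to insert the operation.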