Thanks for your suggestions!!
I have been learning about VTA as you suggested, and it seems that DMA tasks are processed as follows:
- Label the DMA task with a `pragma` using `dma_copy` during scheduling
- Inject the label into a DMA intrinsic function defined in `runtime` via `call_extern` during lowering.
Am I right?
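To make my understanding concrete, here is a toy sketch of the two steps in plain Python. It is not TVM code: the `Stage` class, the `lower` function, and the intrinsic name `hypothetical_dma_load` are all made up for illustration. Only the `dma_copy` pragma key and the idea of emitting a `call_extern` come from what I read in VTA.

```python
from dataclasses import dataclass, field


@dataclass
class Stage:
    """Toy stand-in for a schedule stage (not a TVM class)."""
    name: str
    pragmas: list = field(default_factory=list)

    def pragma(self, axis, key):
        # Step 1: scheduling only attaches a label; no code is generated yet.
        self.pragmas.append((axis, key))


def lower(stage):
    """Toy lowering pass: turn a labeled copy stage into an extern call."""
    for axis, key in stage.pragmas:
        if key == "dma_copy":
            # Step 2: the labeled region is replaced by a call to a runtime
            # intrinsic. "hypothetical_dma_load" is a made-up name.
            return ("call_extern", "hypothetical_dma_load", stage.name, axis)
    # Stages without the label keep an ordinary loop-based copy.
    return ("for_loop_copy", stage.name)


data_buf = Stage("data_buf")
data_buf.pragma("axis0", "dma_copy")

print(lower(data_buf))
print(lower(Stage("plain_buf")))
```

If this matches the real flow, then the pragma is just a marker and all the DMA-specific work happens in the lowering pass that looks for it.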
By the way, the backend module (op decl & sch -> tvm ir -> apply ir_pass -> codegen) is really not easy for me to read, with its complex and deep class structure. I would appreciate it if you could provide some guidance on that.