Hi! I am new to TVM and want to use it to support my accelerator. The hardware is based on Gemmini and uses the RISC-V architecture. Could anyone outline a workflow for using TVM to support a new accelerator with its own instructions? Thanks!
You could refer to this blog post about how to offload a partial model to your accelerator: How to Bring Your Own Codegen to TVM
Thanks a lot! I am very glad you answered my question. But the example in that blog post is a CPU library. Is there a guide for developers who want to bring their ASIC to TVM?
The example in the blog treats DNNL as the external library, which can be replaced with an ASIC library or codegen. If you really need an example of an ASIC accelerator, you could check the Arm Ethos-N integration: [BYOC][ETHOSN] Introduce the Ethos-N BYOC integration by mbaret · Pull Request #6222 · apache/tvm · GitHub
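To make the offloading idea concrete, here is a minimal sketch in plain Python. It does not use TVM at all; it only illustrates splitting a model into regions that go to the accelerator and regions that stay on the host. The supported-operator set and the operator names are assumptions made up for the illustration.

```python
# Toy illustration of partial offloading: split a linear operator sequence
# into regions for the accelerator and regions that stay on the host.
# SUPPORTED_OPS and the op names are hypothetical, not taken from TVM.
SUPPORTED_OPS = {"conv2d", "dense", "relu"}


def partition(ops, supported=SUPPORTED_OPS):
    """Group consecutive ops by target and return [(target, [ops...]), ...]."""
    regions = []
    for op in ops:
        target = "accel" if op in supported else "host"
        if regions and regions[-1][0] == target:
            # Same target as the previous op: extend the current region.
            regions[-1][1].append(op)
        else:
            # Target changed: open a new region.
            regions.append((target, [op]))
    return regions


model = ["conv2d", "relu", "softmax", "dense"]
for target, group in partition(model):
    print(target, group)
```

Each "accel" region is what your library or codegen would receive; everything else falls back to the normal TVM flow on the host.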
Thanks a lot for replying so fast! I have read the blog, but I still have some questions. If we don’t have a compiler or codegen for the accelerator, can we still bring it to TVM? Do we need to use uTVM? Or do we have to create a compiler or codegen for the DLA?
I don’t understand your question. If you don’t have a compiler or codegen, what “program” does your accelerator execute? An accelerator usually has an ISA and you need to compile a deep learning model to the binary of the ISA so that it can be executed. Or you may have a runtime engine that executes a certain graph format such as JSON or ONNX on the accelerator. In this case, you still need to “compile” the model to the format you are targeting.
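As a rough sketch of what "compiling to the binary of the ISA" means at its very simplest, the snippet below packs a short instruction list into bytes. The instruction format (one opcode byte plus three operand bytes) and the opcodes are invented for the illustration; a real accelerator ISA defines its own encoding.

```python
import struct

# Hypothetical instruction format: 1 opcode byte followed by 3 operand bytes.
# The opcodes below are made up for this example.
OPCODES = {"load": 0x01, "matmul": 0x02, "store": 0x03}


def encode(program):
    """Pack (mnemonic, a, b, c) tuples into a flat byte string."""
    blob = bytearray()
    for mnemonic, a, b, c in program:
        blob += struct.pack("<BBBB", OPCODES[mnemonic], a, b, c)
    return bytes(blob)


program = [
    ("load", 0, 0, 0),     # load operand A into buffer 0
    ("load", 1, 16, 0),    # load operand B into buffer 1
    ("matmul", 2, 0, 1),   # buffer 2 = buffer 0 x buffer 1
    ("store", 2, 32, 0),   # write buffer 2 back to memory
]
binary = encode(program)
print(binary.hex())
```

A real codegen would produce the instruction list from the model's operators instead of writing it by hand, but the final step looks much like this.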
Well, our DLA does not have a codegen or compiler yet; it just has an ISA. Do you mean we have to create a compiler or codegen?
hi @wuzheng,
can you speak a bit more about the runtime environment you’d like to support? does your accelerator integrate with a system running a traditional OS such as linux, or is the ASIC meant to be run as bare metal?
if bare metal, µTVM would be the way to go. accelerator support hasn’t landed yet, but is forthcoming and would be done with the BYOC flow @comaniac mentioned above. TVM produces binary instructions using a backend such as LLVM, so if no compiler exists to emit those, you would need to implement one. you also need to build a control library to start and stop compute on the accelerator.
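a control library in its smallest form might look like the sketch below. it runs against a mock device in plain Python so it can be tried anywhere; the register offsets, command values, and class name are all hypothetical, and on real bare-metal hardware this would be C code writing to memory-mapped registers.

```python
class MockAccelerator:
    """Stand-in for memory-mapped control registers; all names are hypothetical."""

    CTRL, STATUS = 0x00, 0x04   # register offsets (made up)
    START, STOP = 0x1, 0x2      # commands written to CTRL (made up)
    IDLE, BUSY = 0x0, 0x1       # values read from STATUS (made up)

    def __init__(self):
        self.regs = {self.CTRL: 0, self.STATUS: self.IDLE}
        self.program = b""

    def write(self, addr, value):
        self.regs[addr] = value
        if addr == self.CTRL:
            # The mock flips STATUS immediately; real hardware would do so
            # when it actually starts or finishes work.
            self.regs[self.STATUS] = self.BUSY if value == self.START else self.IDLE

    def read(self, addr):
        return self.regs[addr]


def start(dev, program):
    """Hand the compiled program to the device and kick off execution."""
    dev.program = program
    dev.write(dev.CTRL, dev.START)


def stop(dev):
    """Halt execution on the device."""
    dev.write(dev.CTRL, dev.STOP)


dev = MockAccelerator()
start(dev, b"\x01\x00\x00\x00")
print("busy:", dev.read(dev.STATUS) == dev.BUSY)
stop(dev)
print("idle:", dev.read(dev.STATUS) == dev.IDLE)
```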
if on linux, you’ll also need to implement a driver. you can still target this accelerator with TVM, but TVM would be emitting an artifact to be passed to the driver.
-Andrew