How to extend NNVM/TVM as a static compiler for HW accelerator like NVDLA?

I have an interest in using NNVM/TVM as a static compiler to generate a runtime image for HW accelerator like that of NVDLA.

NVDLA accelerates at a much higher level of abstraction than a CPU or GPU, operating on whole operators such as conv2d, batchnorm, etc.

Graph-level optimisations such as operator fusion performed by NNVM would therefore be very useful.
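To make the idea concrete, here is a toy sketch (plain Python, not the NNVM/TVM API) of the kind of graph-level operator fusion mentioned above: a linear chain such as conv2d → batchnorm → relu is collapsed into a single fused node, which an accelerator like NVDLA could then execute as one offloaded kernel. The `FUSABLE` set and the string-based graph are purely illustrative assumptions.

```python
# Toy illustration of operator fusion on a linear op sequence.
# Not NNVM/TVM code; real fusion works on a dataflow graph with
# pattern rules, but the idea is the same.

FUSABLE = {"conv2d", "batchnorm", "relu"}  # ops the accelerator can fuse (assumed)

def fuse_chains(ops):
    """Greedily merge adjacent fusable ops into single fused nodes."""
    fused, current = [], []
    for op in ops:
        if op in FUSABLE:
            current.append(op)
        else:
            if current:
                fused.append("fused(" + "+".join(current) + ")")
                current = []
            fused.append(op)
    if current:
        fused.append("fused(" + "+".join(current) + ")")
    return fused

print(fuse_chains(["conv2d", "batchnorm", "relu", "softmax"]))
# -> ['fused(conv2d+batchnorm+relu)', 'softmax']
```

After fusion, the back-end sees one coarse-grained node per chain, which maps naturally onto an accelerator that consumes whole layers rather than individual loop nests.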

Can you advise how best I should approach this, e.g. relevant topics in the documentation/tutorials?

Thanks.
–kahho


Hi @teabun,

The NVDLA is indeed a very popular open source deep learning accelerator design.

I suggest that we start a github issue if you wish to get started on this task and get the community involved.

Here’s how I suggest that you proceed:

  • Start building a code-generation back-end, either statically or using JIT compilation (see how the VTA JIT compiler works: https://tvm.ai/vta)
  • Use the NVDLA simulator as a TVM execution backend to test the correctness of your compiler. This may be quite slow, so I recommend starting with simple test cases, such as a small matrix multiply or a small 2D convolution.
  • Once you’ve tested the functionality of your compiler, start building a TOPI-style operator library for all of the deep learning operators that you wish to offload to NVDLA.
  • This library will be one that you can call into from NNVM or Relay.
  • Test an end-to-end flow on your simulator backend.
  • If you wish to build an FPGA prototype of the NVDLA, I suggest that you start a separate issue to just build the hardware. One of the challenges will be to pick the FPGA platform, FPGA-to-CPU communication model, and building a driver stack that your host code will call into to offload work to the NVDLA.
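The compile-then-verify loop in the steps above can be sketched in a few lines of plain Python (hypothetical names throughout; a real flow would drive the NVDLA simulator or driver stack instead of the `nvdla_matmul` stub): lower each operator either to an accelerator kernel or to a host fallback, then use a small matrix multiply as the correctness oracle.

```python
# Minimal sketch of the "test against a simulator" step.
# All names here (nvdla_matmul, lower) are illustrative assumptions,
# not real TVM or NVDLA APIs.

def reference_matmul(a, b):
    """Known-good host implementation used as the correctness oracle."""
    n, k, m = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(m)]
            for i in range(n)]

def nvdla_matmul(a, b):
    """Stand-in for an offloaded NVDLA kernel; a real backend would
    submit this to the NVDLA simulator and read back the result."""
    return reference_matmul(a, b)

def lower(op, supported=frozenset({"matmul"})):
    """Pick the accelerator kernel when the op is supported,
    otherwise fall back to the host implementation."""
    table = {"matmul": (nvdla_matmul, reference_matmul)}
    accel, host = table[op]
    return accel if op in supported else host

a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
# The offloaded path must agree with the reference path.
assert lower("matmul")(a, b) == reference_matmul(a, b)
```

Starting with a tiny matmul like this keeps simulator runs fast while still exercising the full compile/offload/read-back path.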

Hope this provides some guidance. For a simple deep learning accelerator design built for TVM, please visit our VTA page: https://tvm.ai/vta


Hi @thierry,

Thank you very much for the suggestions; I think these are good ways to start marrying these two open-source projects. NVDLA’s own compiler has been significantly delayed relative to their milestones, and having an alternative compiler and runtime environment targeting NVDLA should also benefit both communities.

Are you aware if anyone or team has started (or starting) this?

Regards.
–kahho

We have heard interest in supporting NVDLA from collaborators, but we are not aware of any concrete effort to add official support.

Do you mind putting up a GitHub issue to flesh out this work and find potential collaborators? I will be happy to provide directions.

Thierry

Hi @teabun,
Recently I found an nvdla_compiler released by another IC company; however, it currently only supports LeNet. I am trying to add support for AlexNet, but I am moving forward slowly due to the lack of documentation on the internals of the NVDLA loadable format and the KMD (kernel-mode driver) scheduler. Also, the code seems hard to extend.

I am also here to explore other possibilities such as TVM + NVDLA. Maybe this is the right way in the long run.

As I’m a newcomer to TVM, I wonder if you have any plans or discussions; I don’t see any issue on the TVM GitHub repo.

@thierry has there been any progress or task on the NVDLA+TVM integration?

Not from our end at the moment. In light of the recent open-source release, I think it’s worth reconsidering the integration.

Yes, that would be great. Let me know if any help is required.

Maybe you can try https://github.com/soDLA-publishment/soDLA, which is a Chisel version of NVDLA.

Maybe we can try the Chisel NVDLA first.

NVDLA has now been integrated with Tengine. If you are still interested in it, please follow this tutorial: 基于Tengine的开源加速器工具链 (“An Open-Source Accelerator Toolchain Based on Tengine”) – Zhihu