Hi all, I am developing custom NN accelerator hardware. I also provide a C API layer that implements basic operators such as conv2d, dense, depthwise conv2d, and others. This implementation is optimized to make full use of the accelerator's features.
I would like to use TVM to parse a model from any frontend (tflite, onnx, keras, …) and eventually generate C code that calls my custom C API for the operators I've implemented and falls back to the default implementation for the ones I haven't implemented yet.
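To make the goal concrete, here is a minimal sketch in plain Python of the dispatch I have in mind. It is not TVM code and does not use any TVM API; it only illustrates the per-operator choice between my C API and a default implementation. The `my_accel_` and `default_` prefixes, the `ACCEL_OPS` set, and the argument order are all hypothetical names made up for illustration.

```python
# Hypothetical set of operators already implemented in the accelerator C API.
ACCEL_OPS = {"conv2d", "dense", "depthwise_conv2d"}


def emit_call(op_name, out, args):
    """Return one line of C that calls the accelerator API or the default kernel."""
    # Hypothetical naming scheme: my_accel_<op> for the custom C API,
    # default_<op> for the fallback implementation.
    prefix = "my_accel" if op_name in ACCEL_OPS else "default"
    return f"{prefix}_{op_name}({', '.join(args + [out])});"


def emit_graph(ops):
    """Emit C calls, in order, for a list of (op_name, output, inputs) tuples."""
    return [emit_call(name, out, list(args)) for name, out, args in ops]


if __name__ == "__main__":
    graph = [
        ("conv2d", "t0", ["input", "w0"]),  # implemented in the accelerator API
        ("softmax", "t1", ["t0"]),          # not implemented, uses the default
    ]
    print("\n".join(emit_graph(graph)))
```

In this toy version the supported-operator set is the only thing that decides where a call goes, so adding a new operator to my C API would just mean adding its name to that set.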
I would appreciate any guidance on the correct way to achieve the above.