I want to modify the VTA core to improve performance. I plan to add new instructions and restructure the GEMM/ALU units as a systolic array.
My final goal is to run the yolov3-tiny model on the custom VTA core, the same way the provided VTA example does.
I can modify the VTA core with custom HLS code, but I don't know which parts of the TVM stack need to be modified or extended.
Can someone tell me which documents and code I should review for this?