Hi @Yashwanth,
You can take a look at what I presented in "Presenting the generation of code for the Gemmini accelerator" (microTVM category, Apache TVM Discuss) and the associated paper, "Integration of a systolic array based hardware accelerator into a DNN operator auto-tuning framework" (arXiv:2212.03034), which sounds exactly like your use case.