I’m interested in using the tensor2tensor library to build a Transformer-based NMT model, and I came across the TVM blog post about batch matmul optimization [http://www.tvmlang.org/2018/03/23/nmt-transformer-optimize.html]. Is there any more information available on how to swap out cuBLAS’s batch matmul for TVM’s batch matmul when using the TensorFlow tensor2tensor library? The sketch below shows roughly what I have in mind for the TVM side; it’s the tensor2tensor/TensorFlow integration part that I’m unsure about.
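A minimal sketch of the kind of kernel I mean, written with TVM’s tensor expression API (shapes 32×64×64×64 and the naive GPU binding are just placeholders for illustration, not the tuned schedule from the blog post):

```python
import tvm
from tvm import te

# Illustrative shapes only: batch of 32, 64x64 matrices.
batch, M, N, K = 32, 64, 64, 64

A = te.placeholder((batch, M, K), name="A", dtype="float32")
B = te.placeholder((batch, K, N), name="B", dtype="float32")
k = te.reduce_axis((0, K), name="k")

# C[b, i, j] = sum_k A[b, i, k] * B[b, k, j]
C = te.compute(
    (batch, M, N),
    lambda b, i, j: te.sum(A[b, i, k] * B[b, k, j], axis=k),
    name="C",
)

s = te.create_schedule(C.op)
b_axis, i_axis, j_axis = s[C].op.axis

# Naive thread binding just so the kernel compiles for CUDA; a real
# replacement for cuBLAS would use a tuned schedule like the one
# described in the blog post.
s[C].bind(b_axis, te.thread_axis("blockIdx.z"))
s[C].bind(i_axis, te.thread_axis("blockIdx.y"))
s[C].bind(j_axis, te.thread_axis("threadIdx.x"))

func = tvm.build(s, [A, B, C], target="cuda")
```

What I can’t figure out is how to get something like this called from tensor2tensor in place of the cuBLAS batch matmul that TensorFlow normally uses, e.g. whether it has to be wrapped as a custom TensorFlow op or whether there is a more direct route.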
Thanks!