How can I use my own CUDA op implementation in TVM?

We are wondering whether it is possible to implement Relay IR ops with handwritten CUDA code, rather than having TVM generate the kernels. If so, how would we do that?

That is, instead of the existing lowering path, we would like Relay to map directly to our CUDA code:

  Relay -> TE -> TIR -> C++/CUDA   (old path)
        -> handwritten CUDA        (new path)

Any help would be greatly appreciated.