At the graph level, I think you can refer to relay.transform.gradient. When you lower the differentiated graph, you may be able to leverage the tensor-level autodiff (te.gradient), though tensor-level gradients are currently mostly written by hand.
You may refer to the test cases.
Currently it is not supported.