TVM transformer pruning results in slowdown

You need to tune your model. See the thread "Performance regression of sparse BERT example".
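For reference, here is a minimal sketch of what tuning can look like with TVM's auto-scheduler. It assumes you already have a Relay module `mod` and its `params` (for example from a frontend importer) and that you are targeting a CPU. The function name `tune_and_build`, the log file name, and the `trials_per_task` budget are placeholders of mine, not values from the linked thread. Whether the auto-scheduler covers the sparse kernels in your pruned model depends on your TVM version, so treat this as a starting point rather than the exact recipe.

```python
def tune_and_build(mod, params, target="llvm",
                   log_file="tuning_log.json", trials_per_task=200):
    """Tune a Relay module with the auto-scheduler, then build it
    using the best schedules recorded during tuning.

    All default values here are illustrative assumptions.
    """
    # Imported lazily so this file can be loaded without TVM installed.
    import tvm
    from tvm import auto_scheduler, relay

    # Extract the tuning tasks (one per distinct fused subgraph).
    tasks, task_weights = auto_scheduler.extract_tasks(
        mod["main"], params, target
    )

    # Spread the measurement budget across all tasks.
    tuner = auto_scheduler.TaskScheduler(tasks, task_weights)
    tuner.tune(
        auto_scheduler.TuningOptions(
            num_measure_trials=trials_per_task * len(tasks),
            # CPU-oriented runner; a GPU target needs a different setup.
            runner=auto_scheduler.LocalRunner(
                repeat=10, enable_cpu_cache_flush=True
            ),
            measure_callbacks=[auto_scheduler.RecordToFile(log_file)],
        )
    )

    # Compile with the best schedules found in the tuning log.
    with auto_scheduler.ApplyHistoryBest(log_file):
        with tvm.transform.PassContext(
            opt_level=3,
            config={"relay.backend.use_auto_scheduler": True},
        ):
            return relay.build(mod, target=target, params=params)
```

Call it as `lib = tune_and_build(mod, params)` and benchmark the resulting library against your untuned build. Tuning can take hours for a transformer-sized model, so start with a small `trials_per_task` to check that the pipeline runs end to end.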