What's the state of TOPI on server-class CPU?

I want to deploy a cross & deep network with NNVM on a server-class CPU, and I'm trying to build the graph incrementally.

[Image: tvm_model_zoo]

As far as I know, kernel quality has a big impact on overall performance.

So, what's the state of TOPI on server-class CPUs? How does it compare with OpenBLAS and NNPACK?

And could anyone give some advice on implementing the cross side? Would I need to write a custom operator?
[Image: cross]
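
To make the question concrete, here is a minimal sketch of the cross side, assuming it refers to the cross network from the Deep & Cross Network paper, where each layer computes x_{l+1} = x_0 * (x_l . w_l) + b_l + x_l. It is plain Python with no TVM or NNVM calls, and the function names `cross_layer` and `cross_network` are hypothetical, used only for illustration. Under that assumption, each layer breaks down into a dot product, a scalar-times-vector multiply, and two elementwise adds, so it might be expressible with existing operators rather than a custom one. How well that would perform on a server-class CPU is not something this sketch shows.

```python
# Reference sketch of a cross layer (assumption: Deep & Cross Network form).
# Names are hypothetical; this is not NNVM or TOPI code.

def cross_layer(x0, xl, w, b):
    """Compute x0 * (xl . w) + b + xl for vectors of equal length."""
    # Dot product of the current layer input with the weight vector (a scalar).
    s = sum(xi * wi for xi, wi in zip(xl, w))
    # Scale the original input by that scalar, then add bias and residual.
    return [x0i * s + bi + xli for x0i, bi, xli in zip(x0, b, xl)]


def cross_network(x0, weights, biases):
    """Apply the cross layers in order, always crossing with the original x0."""
    x = x0
    for w, b in zip(weights, biases):
        x = cross_layer(x0, x, w, b)
    return x


x0 = [1.0, 2.0]
w = [0.5, 0.5]
b = [0.0, 1.0]
one_layer = cross_layer(x0, x0, w, b)
two_layers = cross_network(x0, [w, w], [b, b])
```

Since `xl . w` is a scalar, the first term is a broadcast multiply rather than a full outer product, which is why the layer stays cheap relative to the deep side.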