Thank you for replying!
Using `MKL_VERBOSE=1`, I found that `TVM_NUM_THREADS` does not affect the number of threads used by MKL.
So I set `MKL_NUM_THREADS` instead, which resolved the problems (fluctuation and slow inference).
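For reference, here is a minimal sketch of how I pin both thread pools before loading the compiled module (the thread counts and the module path are placeholders, not my exact setup):

```python
import os

# Thread-count variables must be set before MKL/TVM spin up their thread pools.
os.environ["MKL_NUM_THREADS"] = "8"   # threads for MKL kernels (the one that mattered)
os.environ["TVM_NUM_THREADS"] = "8"   # threads for TVM's own runtime
os.environ["MKL_VERBOSE"] = "1"       # log each MKL call and its thread count

import tvm
from tvm.contrib import graph_executor  # graph_runtime in older TVM versions

# "deploy_lib.so" is a placeholder for the compiled module.
lib = tvm.runtime.load_module("deploy_lib.so")
dev = tvm.cpu(0)
module = graph_executor.GraphModule(lib["default"](dev))
```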
With and without `-libs=mkl`, the measured inference time is approximately the same.
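For context, I measure with something like TVM's `time_evaluator` (a sketch continuing from the loading code above; the input name, shape, and repeat counts are placeholders):

```python
import numpy as np

# Feed a dummy input so the benchmark runs on realistic data.
# "data" and its shape are placeholders for the real model's input.
module.set_input("data", np.random.rand(1, 512).astype("float32"))

# Benchmark the "run" function: 3 repeats of 10 runs each.
ftimer = module.module.time_evaluator("run", dev, number=10, repeat=3)
prof = ftimer()
print("mean inference time: %.3f ms" % (prof.mean * 1000))
```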
While searching for the reason, I found that TVM uses MKL to optimize only the dense layers, and that AutoTVM can also tune dense layers.
So, if MKL is not used, TVM's default tuning applies to all layers; if it is used, MKL handles the dense layers while TVM handles the rest. Is this right?
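To make the comparison concrete, here is a small self-contained sketch of the two build configurations I mean, with a one-dense-layer model standing in for my real network (this assumes TVM was built with `USE_MKL` enabled):

```python
import numpy as np
import tvm
from tvm import relay

# Tiny one-dense-layer model standing in for the real network.
data = relay.var("data", shape=(1, 512), dtype="float32")
weight = relay.var("weight", shape=(1024, 512), dtype="float32")
mod = tvm.IRModule.from_expr(relay.Function([data, weight], relay.nn.dense(data, weight)))
params = {"weight": tvm.nd.array(np.random.rand(1024, 512).astype("float32"))}

# Case 1: plain llvm -- TVM's own (AutoTVM-tunable) schedule covers dense.
# Case 2: llvm -libs=mkl -- dense is dispatched to MKL's GEMM; other ops stay on TVM.
for target in ["llvm", "llvm -libs=mkl"]:
    with tvm.transform.PassContext(opt_level=3):
        lib = relay.build(mod, target=target, params=params)
```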
If I'm right, then since these two cases, AutoTVM (default) and AutoTVM + MKL (dense only), show similar performance, can I say that TVM's schedule primitives achieve performance comparable to MKL's?