I set the target with 'llvm -libs=cblas', but the following warning is raised:
WARNING:autotvm:Cannot find config for target=llvm -libs=cblas, workload=('dense_cblas.x86', ('TENSOR', (1, 128), 'float32'), ('TENSOR', (64, 128), 'float32'), None, 'float32'). A fallback configuration is used, which may bring great performance regression.
Do I still need to auto-tune the model if I just use cBLAS?
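For reference, here is roughly how the model is built (a minimal sketch; mod and params are assumed to come from a frontend importer such as relay.frontend.from_onnx):

import tvm
from tvm import relay

# Build with the cBLAS-enabled target so dense ops can be offloaded.
target = "llvm -libs=cblas"
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target, params=params)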
No, you don't, although this message is admittedly a bit confusing. We register the library implementation as a one-config AutoTVM template so that it can be compared against other implementations. For example, if dense_cblas.x86 takes 1e-3 ms while dense_pack.x86 takes 3e-3 ms, then the op strategy will select dense_cblas.x86 under ApplyHistoryBest, and vice versa.
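Concretely, that comparison happens when you compile inside an apply_history_best context; a minimal sketch (history_best.log is a hypothetical tuning log, mod/params as above):

import tvm
from tvm import autotvm, relay

# Under ApplyHistoryBest, the op strategy queries the tuning log and picks
# the implementation with the lowest measured cost, whether that is the
# library one-config template or a tuned TVM template.
with autotvm.apply_history_best("history_best.log"):
    with tvm.transform.PassContext(opt_level=3):
        lib = relay.build(mod, target="llvm -libs=cblas", params=params)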
I tuned the dense op and want to compare its performance with cBLAS. When I compile with target = 'llvm -libs=cblas', I get this warning, and the performance is far worse than the tuned one (about 10 times slower). Is that normal?
I'm not sure I'm getting the right performance for cBLAS by setting the target to 'llvm -libs=cblas'. Do you have benchmarks comparing TVM auto-tuning and cBLAS?
I don't have benchmarks, but cBLAS should not perform 10x worse than TVM on dense ops. You may need to provide more information, such as your tuning/build scripts and a snippet of the tuning log, so people can help investigate.
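It may also help to confirm how each build is being measured; a common timing pattern (a sketch assuming lib is the result of relay.build, the input is named "data", and a recent TVM where the runtime module is graph_executor rather than graph_runtime):

import numpy as np
import tvm
from tvm.contrib import graph_executor

dev = tvm.cpu()
module = graph_executor.GraphModule(lib["default"](dev))
module.set_input("data", np.random.uniform(size=(1, 128)).astype("float32"))

# time_evaluator returns results in seconds; convert to milliseconds.
ftimer = module.module.time_evaluator("run", dev, number=100, repeat=3)
times_ms = np.array(ftimer().results) * 1e3
print("Mean inference time: %.3f ms" % times_ms.mean())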
def select_implementation(op, attrs, inputs, out_type, target, use_autotvm=True):
    all_impls = get_valid_implementations(op, attrs, inputs, out_type, target)

    # Find the implementation with the highest priority level (plevel).
    best_plevel_impl = None
    for impl in all_impls:
        if best_plevel_impl is None or impl.plevel > best_plevel_impl.plevel:
            best_plevel_impl = impl

    # Without AutoTVM, simply return the highest-priority implementation.
    if not use_autotvm:
        outs = best_plevel_impl.compute(attrs, inputs, out_type)
        return best_plevel_impl, outs

    # With AutoTVM, query the dispatch context for a tuned config per workload.
    outputs = {}
    best_autotvm_impl = None
    best_cfg = None
    dispatch_ctx = autotvm.task.DispatchContext.current
    for impl in all_impls:
        outs = impl.compute(attrs, inputs, out_type)
        outputs[impl] = outs
        workload = autotvm.task.get_workload(outs)
        if workload is None:
            # Not an AutoTVM-registered implementation.
            continue
        cfg = dispatch_ctx.query(target, workload)
        if cfg.is_fallback:
            # It's a fallback config; skip it in favor of tuned configs.
            continue
        if best_cfg is None or best_cfg.cost > cfg.cost:
            best_autotvm_impl = impl
            best_cfg = cfg

    # Prefer the implementation with the cheapest tuned config; otherwise
    # fall back to the highest-priority one.
    if best_autotvm_impl:
        return best_autotvm_impl, outputs[best_autotvm_impl]
    return best_plevel_impl, outputs[best_plevel_impl]
Hello, I found the code above. It seems that if ~/.tvm/tophub/llvm_0.04.log is not empty, TVM will select the AutoTVM config if one exists, even if I set the target to 'llvm -libs=cblas'.
So, if I want to benchmark cBLAS, do I need to clear the tuning log first? Is that correct?
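One way to sidestep the dispatch logic entirely is to benchmark the cBLAS kernel directly through the contrib wrapper (a sketch; shapes are taken from the workload in the warning above):

import tvm
from tvm import te
from tvm.contrib import cblas

# dense(data, weight): data is (1, 128), weight is (64, 128), so the
# second operand is transposed, matching topi's dense convention.
A = te.placeholder((1, 128), name="A", dtype="float32")
B = te.placeholder((64, 128), name="B", dtype="float32")
C = cblas.matmul(A, B, transa=False, transb=True)
s = te.create_schedule(C.op)
func = tvm.build(s, [A, B, C], target="llvm -libs=cblas")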
AFAIK, TopHub records do not include dense, only conv2d, so your model should still use cBLAS for dense ops.