In my case, TVM is running on an Intel x86 CPU.
With the quantization step, time_evaluator reports 10 ms. Without it, time_evaluator reports 6 ms.
My code is as follows:
import mxnet as mx
import tvm
from tvm import relay

# load model
sym, arg_params, aux_params = mx.model.load_checkpoint("./mobnetv2_1.0_224x224", 120)
# frontend
dtype = {'data': 'float32'}
data_shape = {'data': (1, 3, 224, 224)}
sym, params = tvm.relay.frontend.from_mxnet(sym, shape=data_shape, dtype=dtype, arg_params=arg_params, aux_params=aux_params)
# quantization
sym = tvm.relay.quantize.quantize(sym, params)
# build
target = "llvm"
with relay.build_config(opt_level=3):
    graph, lib, params = relay.build_module.build(sym, target=target, params=params)
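
To cross-check the two numbers outside of time_evaluator, a plain wall-clock harness along the following lines could be used. This is only a sketch: `measure_ms` and `summarize` are hypothetical helper names, not part of TVM, and the `number` / `repeat` arguments merely mirror the parameters of the same name on time_evaluator. The `workload` function is a stand-in and would need to be replaced by a call that runs the compiled module once.

```python
import statistics
import time


def measure_ms(fn, number=10, repeat=3, warmup=1):
    """Return one mean latency in milliseconds per repeat.

    Hypothetical helper: each repeat calls fn() `number` times and
    averages the wall-clock time over those calls.
    """
    for _ in range(warmup):
        fn()  # discard warm-up runs (cache fills, lazy initialisation)
    results = []
    for _ in range(repeat):
        start = time.perf_counter()
        for _ in range(number):
            fn()
        elapsed = time.perf_counter() - start
        results.append(elapsed / number * 1000.0)
    return results


def summarize(results_ms):
    """Mean and population standard deviation of the per-repeat latencies."""
    return statistics.mean(results_ms), statistics.pstdev(results_ms)


if __name__ == "__main__":
    # Stand-in workload (assumption): replace with one run of the built module.
    def workload():
        sum(i * i for i in range(10000))

    mean_ms, std_ms = summarize(measure_ms(workload))
    print("mean %.3f ms, std %.3f ms" % (mean_ms, std_ms))
```

Running the same harness with identical `number`, `repeat`, and input data for both the quantized and the non-quantized build would show whether the 10 ms vs 6 ms gap also appears in end-to-end wall-clock time.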