I’m trying to deploy a model on a Jetson Xavier NX with TVM using TensorRT, following the tutorial here.
Building with the original TVM template works fine:
# model is mobilenet_v2
import time
import numpy as np
import tvm
from tvm import relay
from tvm.contrib import graph_runtime

mod, params = relay.frontend.from_pytorch(scripted_model, shape_list)
tgt = tvm.target.cuda()
ctx = tvm.gpu(0)
with tvm.transform.PassContext(opt_level=3):
    g, m, p = relay.build(mod, tgt, params=params)
module = graph_runtime.create(g, m, ctx)
module.set_input(**p)
for i in range(15):
    start = time.clock_gettime(time.CLOCK_REALTIME) * 1000
    data = np.random.uniform(-1, 1, (1, 3, 224, 224)).astype("float32")
    module.set_input("data", data)
    module.run()
    ctx.sync()
    end = time.clock_gettime(time.CLOCK_REALTIME) * 1000
    print(end - start)
The results for each opt_level look reasonable (each value is the median of 10 iterations, excluding the first 5 warm-up runs):
opt_level=0: 67.904052734375ms
opt_level=1: 44.286865234375ms
opt_level=2: 42.317626953125ms
opt_level=3: 39.987060546875ms
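As an aside, the measurement methodology above (15 runs, median of the last 10) can be factored into a small helper. This is just a sketch; `run_once` is a hypothetical stand-in for the per-iteration body (set input, run, sync):

```python
import statistics
import time

def median_latency_ms(run_once, iters=15, warmup=5):
    # Time each iteration with a wall clock and return the median
    # of the post-warm-up runs, in milliseconds.
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        run_once()
        samples.append((time.perf_counter() - start) * 1000)
    return statistics.median(samples[warmup:])
```

With the graph runtime this could be called as, e.g., `median_latency_ms(lambda: (module.run(), ctx.sync()))`; `time.perf_counter()` is a monotonic clock, so it is a safer choice for intervals than `CLOCK_REALTIME`.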
But if I build with the TensorRT template as below:
# model is mobilenet_v2
mod, params = relay.frontend.from_pytorch(scripted_model, shape_list)
from tvm.relay.op.contrib.tensorrt import partition_for_tensorrt
mod, config = partition_for_tensorrt(mod, params)
tgt = tvm.target.cuda()
ctx = tvm.gpu(0)
with tvm.transform.PassContext(opt_level=args.opt_level,
                               config={"relay.ext.tensorrt.options": config}):
    g, m, p = relay.build(mod, tgt, params=params)
module = graph_runtime.create(g, m, ctx)
module.set_input(**p)
for i in range(15):
    start = time.clock_gettime(time.CLOCK_REALTIME) * 1000
    data = np.random.uniform(-1, 1, (1, 3, 224, 224)).astype("float32")
    module.set_input("data", data)
    module.run()
    ctx.sync()
    end = time.clock_gettime(time.CLOCK_REALTIME) * 1000
    print(end - start)
The resulting latency is suspiciously short, and it looks like the context sync is not actually taking effect:
opt_level=0: 5.004150390625ms
I’m fairly sure of this, because I see a similar value if I remove `ctx.sync()` from the original template.
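For what it’s worth, the symptom matches timing an asynchronous launch without waiting for it: if `run()` only enqueues work and returns immediately, the wall clock sees little more than the launch overhead. A CPU-only sketch of the same effect, using a background thread as a stand-in for the asynchronous device work:

```python
import threading
import time

def launch_async_work(duration=0.05):
    # Stand-in for an asynchronous device launch: the "work" runs
    # in the background while the caller returns immediately.
    t = threading.Thread(target=time.sleep, args=(duration,))
    t.start()
    return t

# Timed without waiting: only the launch overhead is measured.
start = time.perf_counter()
handle = launch_async_work()
no_wait_ms = (time.perf_counter() - start) * 1000

# Timed with an explicit wait (the analogue of ctx.sync()):
start = time.perf_counter()
handle2 = launch_async_work()
handle2.join()
wait_ms = (time.perf_counter() - start) * 1000
handle.join()
```

Here `no_wait_ms` comes out far smaller than `wait_ms`, which is why I suspect the sync is being skipped rather than TensorRT being this fast.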
I checked the issue reported here and tried the same code, but the result was the same.
Does anyone have an idea what is going on?