Why is the time cost of inferring resnet50-v2-7 with tvm-cuda much smaller than with TensorRT on x86?

It took 0.11 ms when I inferred resnet50-v2-7 with tvm-cuda on x86. Then I used TensorRT to infer the same model with fp16, and it took 0.4 ms. Is it normal for tvm-cuda to be so much faster than TensorRT on x86?
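For what it's worth, a gap this large often comes from how the latency is measured rather than from the runtimes themselves (no warm-up, timing only the asynchronous kernel launch, or averaging over a single run). Below is a minimal, framework-agnostic sketch of a fairer timing harness; `benchmark` and `fake_infer` are hypothetical names, and the real `infer` callable would wrap the actual tvm-cuda or TensorRT inference call:

```python
import time
import statistics

def benchmark(infer, warmup=10, repeat=100):
    """Return the median latency of `infer()` in milliseconds.

    Warm-up runs are discarded so one-time costs (CUDA context
    creation, engine build, memory allocation) do not skew the
    result; the median over many repeats smooths out jitter.
    """
    for _ in range(warmup):
        infer()
    samples = []
    for _ in range(repeat):
        start = time.perf_counter()
        infer()
        # NOTE: for GPU backends a device synchronization is needed
        # here before reading the clock; otherwise only the launch
        # overhead is timed, not the kernel execution itself.
        samples.append((time.perf_counter() - start) * 1e3)  # ms
    return statistics.median(samples)

# Stand-in for a real resnet50-v2-7 inference call (~0.1 ms of work).
def fake_infer():
    time.sleep(0.0001)

print(f"median latency: {benchmark(fake_infer):.3f} ms")
```

If the 0.11 ms figure was taken without a device synchronization after the tvm-cuda call, it may reflect only the launch time, which would explain it appearing faster than TensorRT's fp16 number.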