Model compiled with the auto-scheduler is much slower than TensorRT on Jetson Xavier NX

Hi,

I compiled a mixed-precision model on an RTX 2070. The TVM auto-scheduler gave me 87 fps, and TensorRT gave me 52 fps.

But with the same model and the same script, I only get 6 fps on a Jetson Xavier NX, while TensorRT gives me 15 fps.

Here is the tuning script:

tune_fp16.py (github.com)

Is this normal? Since TVM is faster than TensorRT on the RTX 2070, I would expect the same to hold on the Jetson Xavier NX.