I am trying to tune a model with the auto-scheduler over RPC, targeting a Jetson NX board from an x86 PC with an RTX 3080.
This is what the board keeps reporting:
2023-12-06 17:58:57.199 INFO connected from ('[x86 PC IP]', 35042)
2023-12-06 17:58:57.203 INFO start serving at /tmp/tmpb2upll9w
2023-12-06 17:59:07.223 INFO timeout in RPC session, kill..
2023-12-06 17:59:07.280 INFO finish serving ('[x86 PC IP]', 35042)
2023-12-06 17:59:07.375 INFO connected from ('[x86 PC IP]', 40966)
2023-12-06 17:59:07.380 INFO start serving at /tmp/tmplh201tek
2023-12-06 17:59:08.006 INFO finish serving ('[x86 PC IP]', 40966)
2023-12-06 17:59:08.102 INFO connected from ('[x86 PC IP]', 40978)
2023-12-06 17:59:08.108 INFO start serving at /tmp/tmp0h0uxuzy
2023-12-06 17:59:18.139 INFO timeout in RPC session, kill..
2023-12-06 17:59:18.212 INFO finish serving ('[x86 PC IP]', 40978)
Even with the timeout set to 60 seconds, the RPC session still times out.
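One detail I noticed: the killed sessions die almost exactly 10 seconds after they start serving, which looks like a 10-second timeout firing somewhere, not the 60-second one. A quick check of the timestamps in the log above (plain Python, no TVM needed):

```python
from datetime import datetime

# (start serving, killed) pairs taken from the server log above
sessions = [
    ("17:58:57.203", "17:59:07.223"),
    ("17:59:08.108", "17:59:18.139"),
]
fmt = "%H:%M:%S.%f"
for start, kill in sessions:
    delta = datetime.strptime(kill, fmt) - datetime.strptime(start, fmt)
    print(f"{delta.total_seconds():.2f} s")  # prints 10.02 s and 10.03 s
```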
This is how I set up the RPC system:
- Run a tracker on the x86 PC:
python3 -m tvm.exec.rpc_tracker --host=0.0.0.0 --port=9190
- Run a server on the board:
python3 -m tvm.exec.rpc_server --tracker=[x86 PC IP]:9190 --key=jetson
- Confirm the Jetson board is registered by running, on the x86 PC:
python3 -m tvm.exec.query_rpc_tracker --host=0.0.0.0 --port=9190

Then I run my auto-scheduler script, which looks like this:
mod, params = relay.frontend.from_onnx(...)
target = tvm.target.cuda(arch="sm_72")  # sm_72 for the NX

tasks, task_weights = auto_scheduler.extract_tasks(
    mod["main"], target=target, params=params)
tuner = auto_scheduler.TaskScheduler(tasks, task_weights)
tune_option = auto_scheduler.TuningOptions(
    num_measure_trials=1000,
    runner=auto_scheduler.RPCRunner(
        key='jetson', host='127.0.0.1', port=9190,  # port as an int, not '9190'
        number=10, timeout=10),
    measure_callbacks=[auto_scheduler.RecordToFile(logPath)],
)
tuner.tune(tune_option)
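For reference, here is a minimal connectivity check that can be run from the x86 PC before tuning, to confirm the tracker can actually hand out a 'jetson' session and that the remote GPU is usable (it assumes the same tracker host, port, and key as above):

```python
from tvm import rpc

# Ask the tracker (running locally on the x86 PC) for a summary and a session.
tracker = rpc.connect_tracker('127.0.0.1', 9190)
print(tracker.text_summary())  # the 'jetson' key should appear as free

remote = tracker.request('jetson', session_timeout=60)
dev = remote.cuda(0)
print(dev.exist)  # True if the board's GPU is reachable over RPC
```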
TVM version: 0.15.dev0
I can tune the model with CUDA locally on the board, but it is too slow. Over RPC, it keeps timing out. Any suggestions?