Setting TVM_NUM_THREADS before starting the RPC server on the edge device does not affect the speed of the tuning process. It does, however, affect model evaluation time (ms).
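For reference, here is a minimal sketch of what "setting TVM_NUM_THREADS before starting the RPC server" can look like when the server is launched from Python on the device. The helper names (rpc_server_env, rpc_server_command, start_rpc_server) and the host/port defaults are my own assumptions for illustration. The only TVM piece used is the standard `python -m tvm.exec.rpc_server` entry point. If you register with a tracker you would also need the tracker and key options, which are left out here.

```python
import os
import subprocess
import sys


def rpc_server_env(num_threads=None, base_env=None):
    """Return a copy of the environment with TVM_NUM_THREADS set or removed.

    num_threads=None reproduces the "unset" case from the measurements.
    """
    env = dict(os.environ if base_env is None else base_env)
    if num_threads is None:
        env.pop("TVM_NUM_THREADS", None)
    else:
        env["TVM_NUM_THREADS"] = str(int(num_threads))
    return env


def rpc_server_command(host="0.0.0.0", port=9090):
    """Build the command line for the TVM RPC server (host/port are assumed values)."""
    return [
        sys.executable, "-m", "tvm.exec.rpc_server",
        f"--host={host}", f"--port={port}",
    ]


def start_rpc_server(num_threads=None, host="0.0.0.0", port=9090):
    """Start the RPC server as a child process that sees the chosen thread setting."""
    return subprocess.Popen(
        rpc_server_command(host, port),
        env=rpc_server_env(num_threads),
    )


# Usage on the device (requires TVM to be installed there):
#   server = start_rpc_server(num_threads=4)
#   ...
#   server.terminate()
```

The variable has to be in the environment of the server process itself, which is why it is passed through `env=` here rather than being set later from the host side.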
Below are three tuning runs with different TVM_NUM_THREADS values: unset, 1, and 4.
As you can see, the elapsed time at the same point in the run (48/88 trials of task 1) is the same in all three cases, about 122 s:
#TVM_NUM_THREADS - time
unset - [Task 1/20] Current/Best: 0.42/ 1.95 GFLOPS | Progress: (48/88) | 121.26 s
1 - [Task 1/20] Current/Best: 0.73/ 1.45 GFLOPS | Progress: (48/88) | 122.30 s
4 - [Task 1/20] Current/Best: 1.07/ 2.30 GFLOPS | Progress: (48/88) | 121.97 s
However, setting TVM_NUM_THREADS before starting the RPC server does make a difference for model evaluation:
#TVM_NUM_THREADS - time
unset - Mean inference time (std dev): 203.02 ms (0.05 ms) (top with per-core view, key 1, shows that the runtime uses 2 cores out of 4)
1 - Mean inference time (std dev): 394.82 ms (0.03 ms)
4 - Mean inference time (std dev): 104.85 ms (0.03 ms)
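To make the scaling easier to read, the evaluation numbers above can be turned into speedups relative to the single-thread run. This is just arithmetic on the reported means. The thread count of 2 for the unset case is taken from the top observation and is an assumption, not something TVM reported.

```python
# Mean inference times (ms) copied from the measurements above.
mean_ms = {"unset": 203.02, "1": 394.82, "4": 104.85}

# Threads actually in use; "unset" = 2 is assumed from the top observation.
threads = {"unset": 2, "1": 1, "4": 4}


def speedups(times_ms, baseline="1"):
    """Speedup of each setting relative to the baseline setting."""
    base = times_ms[baseline]
    return {name: base / t for name, t in times_ms.items()}


s = speedups(mean_ms)
for name in mean_ms:
    efficiency = s[name] / threads[name]  # speedup per thread used
    print(f"TVM_NUM_THREADS={name}: {s[name]:.2f}x speedup, "
          f"{efficiency:.0%} per-thread efficiency")
```

This gives roughly 1.9x for the unset case and roughly 3.8x for 4 threads, so inference scales close to linearly with the number of threads, while the tuning wall-clock time stays flat.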