The Pixel 6 has 2x X1 + 2x A76 + 4x A55 cores. As a result, the TVM thread pool thinks there are only 2 big cores and 4 little cores. I would like to use all 4 big cores (2x X1 + 2x A76) to profile models.
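One plausible reading of why the pool sees only 2 big cores: assuming it ranks cores by maximum frequency and counts as big only those tied with the fastest one (and as little only those tied with the slowest), the A76 pair falls into neither group. Below is a minimal Python sketch of that assumed rule. The frequencies are nominal Pixel 6 values used for illustration, and count_big_little is a hypothetical helper, not TVM code:

```python
def count_big_little(max_freqs_khz):
    """Classify cores by max frequency (assumed rule, not TVM's actual code).

    Cores tied with the fastest core count as big; cores tied with the
    slowest core count as little; anything in between is counted as neither.
    """
    ordered = sorted(max_freqs_khz, reverse=True)
    big = sum(1 for f in ordered if f == ordered[0])
    little = sum(1 for f in ordered if f == ordered[-1])
    return big, little


# Illustrative per-core max frequencies in kHz: 4x A55, 2x A76, 2x X1.
pixel6 = [1803000] * 4 + [2253000] * 2 + [2802000] * 2

print(count_big_little(pixel6))  # (2, 4): the two A76 cores are left out
```

Under this rule, any SoC with three frequency clusters would leave its middle cluster out of both counts.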
I have already tried the runtime.config_threadpool method, but it does not work: every time I configure the thread pool to use all 4 big cores, TVM configures it back to use only the 2x X1 cores afterward.
The following is how I use the method:
config_func(0, 0)
# I use TVM_NUM_THREADS=4 and "taskset f0" to control TVM to use the big cores
print(m.benchmark(ctx))
reports = []
for i in range(100):
    config_func(0, 0)
    reports.append(m.profile().csv())
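For reference, the f0 passed to taskset is a hexadecimal CPU bitmask in which bit n selects CPU n. The small sketch below lists the CPUs a mask selects; cpus_in_mask is a hypothetical helper, and the idea that CPUs 4-7 are the A76 and X1 cores on the Pixel 6 is an assumption about the core numbering:

```python
def cpus_in_mask(mask_hex):
    """Return the CPU indices selected by a taskset-style hex bitmask."""
    mask = int(mask_hex, 16)
    return [cpu for cpu in range(mask.bit_length()) if (mask >> cpu) & 1]


print(cpus_in_mask("f0"))  # [4, 5, 6, 7]
```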
You may ask how I know TVM only uses 2 threads. I added logging inside the Configure function to see how num_workers_used changes: https://github.com/apache/tvm/blob/main/src/runtime/threading_backend.cc#L157.