Setting the CPU affinity and number of cores locally without RPC Remote

Hello,

This is a continuing discussion from Use all cores in a big.LITTLE architecture:

I am wondering how we can adjust the CPU affinity and the number of threads locally, without using the “remote” setting.

So far I have been able to set the number of threads through the RPC remote, as the previous discussion shows (command shown below).

image

However, I am still wondering whether we can set it locally, without RPC Remote, as the following figure shows. According to Can TVM split work into different layers, and assign layers into different cores? - #10 by hjiang, it seems to be possible.

image

Does anyone have any idea about this question? Thanks :slight_smile:

cc @FrozenGene @hjiang

To be more precise, I am working on a Hikey 970 with 4 little cores (core ids 0-3) and 4 big cores (core ids 4-7). I am splitting the entire Relay graph into subgraphs and running them in a pipeline.

I know that with the following commands I can control the number of CPU threads locally.

export TVM_BIND_THREADS=1   # enable CPU affinity
export TVM_NUM_THREADS=N    # N = 1 to 4: the number of CPU threads you want to use
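For reference, the same two variables can also be set from Python for a child process. This is only a minimal sketch: `threadpool_env` and `run_with_threads` are hypothetical helpers (not part of TVM), and it assumes the runtime reads the variables when the workload starts, so they have to be in place before the benchmark process is launched. Treating `TVM_BIND_THREADS=0` as "binding off" is also an assumption.

```python
import os
import subprocess
import sys


def threadpool_env(num_threads, bind_threads=True, base_env=None):
    """Return a copy of the environment with the two TVM thread variables set.

    Hypothetical helper, not part of TVM.
    """
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: "1" enables affinity binding and "0" disables it.
    env["TVM_BIND_THREADS"] = "1" if bind_threads else "0"
    env["TVM_NUM_THREADS"] = str(num_threads)
    return env


def run_with_threads(argv, num_threads, bind_threads=True):
    """Run a command (e.g. a benchmark script) in a child process with those settings."""
    return subprocess.run(
        argv,
        env=threadpool_env(num_threads, bind_threads),
        capture_output=True,
        text=True,
        check=True,
    )


if __name__ == "__main__":
    # Stand-in for a real benchmark: the child only echoes what it received.
    child = "import os; print(os.environ['TVM_NUM_THREADS'], os.environ['TVM_BIND_THREADS'])"
    result = run_with_threads([sys.executable, "-c", child], num_threads=2)
    print(result.stdout.strip())
```

Like the export commands, this is still a global, per-process setting; it does not choose between big and little cores.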

However, I have two questions on this problem:

  1. These settings only let me use 1, 2, 3, or 4 big cores; the default setting does not seem to let users use the little cores. Even if I set TVM_NUM_THREADS larger than 4, it still uses only the 4 big cores. Is it possible to use the little cores, or a mix of big and little cores (e.g. 3 big and 4 little)?

  2. Moreover, these settings make every benchmark use the assigned cores. Since I split the original graph into subgraphs, what I would actually like is to assign specific subgraphs to specific cores (something like subgraph 0 → 2 big cores, subgraph 1 → 2 big cores, subgraph 2 → 4 little cores) instead of changing the setting globally.
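The per-subgraph assignment in question 2 can be written down as a plain mapping. The sketch below assumes the Hikey 970 core ids given above; `assign_cores` is a hypothetical helper that only builds and checks the plan. It does not pin any TVM threads.

```python
LITTLE_CORES = (0, 1, 2, 3)  # Hikey 970 little cluster, per the post
BIG_CORES = (4, 5, 6, 7)     # Hikey 970 big cluster, per the post


def assign_cores(requests):
    """Map each subgraph index to a disjoint tuple of core ids.

    requests: list of (kind, count) tuples in subgraph order, where kind is
    "big" or "little". Raises ValueError if a cluster runs out of cores.
    """
    free = {"big": list(BIG_CORES), "little": list(LITTLE_CORES)}
    plan = {}
    for index, (kind, count) in enumerate(requests):
        if kind not in free:
            raise ValueError(f"unknown core kind: {kind!r}")
        if count < 1 or count > len(free[kind]):
            raise ValueError(f"subgraph {index}: cannot take {count} {kind} cores")
        # Hand out the first `count` free cores of that cluster, then remove them.
        plan[index] = tuple(free[kind][:count])
        del free[kind][:count]
    return plan


if __name__ == "__main__":
    # subgraph 0 -> 2 big cores, subgraph 1 -> 2 big cores, subgraph 2 -> 4 little cores
    plan = assign_cores([("big", 2), ("big", 2), ("little", 4)])
    for subgraph, cores in plan.items():
        print(f"subgraph {subgraph}: cores {cores}")
```

On Linux, each tuple could in principle be passed to os.sched_setaffinity in the process that runs that subgraph, but whether that cooperates with TVM's own thread pool is exactly the open question here.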

@hjiang may I ask whether you could share some of your implementation, even though it is still under development?

According to this post:

I tried to create a function in “module” (python/runtime/module.py) that hooks C++ into Python, so that I can directly use

module.module.get_global_function('runtime.config_threadpool')

which is very similar to what we did with remote.get_function('runtime.config_threadpool').

remote is an “RPCSession(object)” whose “get_function” uses “get_global_function” to look up a global function. Therefore, I put get_global_function into python/runtime/module.py.

image

And python/runtime/module.py will use “get_global_func” from tvm/_ffi/_ctypes/packed_func.py.

image

May I ask whether the above steps are the correct way to enable such a setting?

I fixed this problem myself. I was able to set it locally without RPC Remote.

image

What I did is very similar to the approach described above: hooking C++ into Python.

In my python/runtime/module.py, I added a “conf_set” function:

image

To set the configuration, I can use this command directly, just as we did with remote.get_function:

config_threadpool = module.module.config_set(remote.get_function('runtime.config_threadpool'))
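The local call described above can be sketched without patching module.py. This is a hedged sketch, not the code from the screenshots: it assumes that 'runtime.config_threadpool' can be looked up in the local process (for example with tvm.get_global_func) and that it takes an affinity mode and a thread count, with 1 meaning big cores and -1 meaning little cores, as in the earlier RPC discussion. The lookup function is passed in so the example runs without TVM installed; `configure_threadpool` and the two constants are hypothetical names.

```python
AFFINITY_BIG = 1      # assumed to select the big cluster
AFFINITY_LITTLE = -1  # assumed to select the little cluster


def configure_threadpool(get_global_func, mode, num_threads):
    """Look up 'runtime.config_threadpool' locally and call it.

    get_global_func is injected so this runs without TVM; with TVM
    installed, one would pass tvm.get_global_func instead of a stand-in.
    """
    if mode not in (AFFINITY_BIG, AFFINITY_LITTLE):
        raise ValueError("mode must be AFFINITY_BIG or AFFINITY_LITTLE")
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")
    config_threadpool = get_global_func("runtime.config_threadpool")
    config_threadpool(mode, num_threads)


if __name__ == "__main__":
    calls = []

    def fake_get_global_func(name):
        # Stand-in registry lookup: records the call instead of touching threads.
        return lambda mode, n: calls.append((name, mode, n))

    # Request 4 threads on the little cores (under the assumptions above).
    configure_threadpool(fake_get_global_func, AFFINITY_LITTLE, 4)
    print(calls)
```

With TVM installed, the intended usage would be something like configure_threadpool(tvm.get_global_func, AFFINITY_LITTLE, 4) before running the module.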