Use all cores in a big.LITTLE architecture

Hello @FrozenGene. I agree that we will get worse performance when using all cores.

I ran some simulations with the TensorFlow Lite benchmark to see the performance when using all cores (Fig. 2).

But I am wondering how I can adjust the number of threads. I checked the tvm/src/runtime/threading_backend.cc file and found that the default setting uses 4 big cores. I tried to adjust the number of threads (e.g. using only 1 small core, or only 3 big cores), but the inference still seems to use 4 cores.
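
For reference, here is a minimal sketch of the kind of configuration I mean. It assumes the build exposes the `runtime.config_threadpool` packed function and reads the `TVM_NUM_THREADS` environment variable, and that the affinity codes are 1 for big cores and -1 for little cores, as I understand the `AffinityMode` enum. The helper names `threadpool_args` and `configure` are hypothetical, and I am not sure this is the intended way to do it.

```python
import os

# Affinity codes as I understand TVM's AffinityMode enum (assumption):
# 1 selects the big cluster, -1 selects the little cluster.
AFFINITY = {"big": 1, "little": -1}


def threadpool_args(cluster, nthreads):
    """Return the (mode, nthreads) pair to pass to runtime.config_threadpool."""
    if cluster not in AFFINITY:
        raise ValueError("cluster must be 'big' or 'little'")
    if nthreads < 1:
        raise ValueError("nthreads must be at least 1")
    return AFFINITY[cluster], nthreads


def configure(cluster, nthreads):
    """Try to apply the setting; return True only if TVM accepted the call."""
    mode, n = threadpool_args(cluster, nthreads)
    # The environment variable only matters if it is set before the
    # runtime creates its thread pool.
    os.environ["TVM_NUM_THREADS"] = str(n)
    try:
        import tvm
    except ImportError:
        return False  # TVM is not installed on this machine
    func = tvm.get_global_func("runtime.config_threadpool", allow_missing=True)
    if func is None:
        return False  # this build does not register the function
    func(mode, n)
    return True
```

For example, `configure("big", 3)` would request 3 big cores and `configure("little", 1)` would request 1 small core. When running over RPC, I assume the same function has to be looked up on the remote device rather than on the host.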

Thanks.