Can anyone explain how AutoTVM runs when it’s auto-tuning an operator on a CPU? Even better, could you point me to documentation if there is any?
Specifically, my questions are:
- By default, does the xgboost Python library run on a single CPU core? With or without multithreading? Is there any way to speed up the auto-tuning process, e.g. by building from source with libraries like OpenMP? (I’m using Ubuntu 16.04.)
- When AutoTVM is tuning an operator, how is CPU usage shared between training and kernel runtime measurements? Does it affect measurement accuracy if training and measurements run on the same CPU, since the measurement job then doesn’t get the full resources? Should I always set up the RPC tracker so that training is done on one host CPU and measurements are done on another?
Thanks in advance!