lyq
April 25, 2018, 3:27am
1
I’m inspecting graph_runtime.cc and trying to understand the internal inference data flow. I’ll be deploying on a server-class CPU.
My understanding is that the graph_runtime.create function binds each Node to its corresponding operator when loading graph.json and deploy.so.
When the function below is invoked, the graph is evaluated sequentially:
void Run() {
  // setup the array and requirements.
  for (size_t i = 0; i < op_execs_.size(); ++i) {
    if (op_execs_[i]) op_execs_[i]();
  }
}
How can I make graph_runtime run in parallel at the graph level?
AFAIK, we don’t have operator-level parallelization yet. I agree that this would be extremely useful for some networks like RNNs. Contributions are definitely welcome.
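To make the idea concrete, here is a minimal standalone sketch (not TVM code; the Node structure and all names are hypothetical) of what graph-level scheduling could look like: nodes are grouped into dependency levels, and every op within a level is launched concurrently with std::async.

#include <algorithm>
#include <cstdio>
#include <functional>
#include <future>
#include <vector>

struct Node {
  std::function<void()> exec;  // the operator's compiled function
  std::vector<int> deps;       // indices of upstream nodes
};

// Group nodes into levels: a node's level is 1 + the max level of its deps.
// Assumes the node array is already in topological order.
std::vector<std::vector<int>> BuildLevels(const std::vector<Node>& nodes) {
  std::vector<int> level(nodes.size(), 0);
  std::vector<std::vector<int>> levels;
  for (size_t i = 0; i < nodes.size(); ++i) {
    for (int d : nodes[i].deps) level[i] = std::max(level[i], level[d] + 1);
    if (static_cast<size_t>(level[i]) >= levels.size()) levels.resize(level[i] + 1);
    levels[level[i]].push_back(static_cast<int>(i));
  }
  return levels;
}

// Launch all ops in one level concurrently; barrier before the next level.
void RunParallel(const std::vector<Node>& nodes) {
  for (const auto& lvl : BuildLevels(nodes)) {
    std::vector<std::future<void>> futs;
    for (int i : lvl) futs.push_back(std::async(std::launch::async, nodes[i].exec));
    for (auto& f : futs) f.wait();
  }
}

int main() {
  // Diamond graph 0 -> {1, 2} -> 3: ops 1 and 2 can run in parallel.
  std::vector<Node> g(4);
  for (int i = 0; i < 4; ++i) g[i].exec = [i] { std::printf("op %d\n", i); };
  g[1].deps = {0};
  g[2].deps = {0};
  g[3].deps = {1, 2};
  RunParallel(g);
  return 0;
}

Level-by-level barriers are the simplest scheme; a real scheduler would more likely keep a per-node dependency counter and enqueue an op the moment all of its inputs are ready.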
aca88
March 21, 2019, 2:21pm
3
Hello,
I wanted to ask: what is the status of this question?
According to the repo code, it would seem that no parallelization is possible at the graph level.
But if I inspect other parts of the code base (for example):
namespace tvm {
namespace runtime {

// stride in the page, fit to cache line.
constexpr int kSyncStride = 64 / sizeof(std::atomic<int>);

/*!
 * \brief Thread local master environment.
 */
class ParallelLauncher {
 public:
  // Reset the task request.
  void Init(FTVMParallelLambda flambda,
            void* cdata,
            int num_task,
            bool need_sync) {
    num_pending_.store(num_task);
    this->cdata = cdata;
    this->flambda = flambda;
    this->env.num_task = num_task;
    // ...
Then I get the feeling that parallelization is implemented at other levels.
Can anyone please clarify?
Thanks
Hi, from my understanding, ParallelLauncher implements parallelism within a single operator, not graph-level parallelism. Currently there is no way to run different operators in parallel. I am also working on this issue; maybe we can discuss it.
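To illustrate the distinction, here is a minimal standalone sketch (not TVM's actual thread pool; AddTask and all names are made up) of intra-operator parallelism in the spirit of ParallelLauncher: a single operator's workload is split into num_task chunks, each executed by its own worker.

#include <algorithm>
#include <cstdio>
#include <thread>
#include <vector>

// Hypothetical "parallel lambda": compute one chunk of an elementwise add.
void AddTask(const float* a, const float* b, float* c, int n,
             int task_id, int num_task) {
  int chunk = (n + num_task - 1) / num_task;
  int begin = task_id * chunk;
  int end = std::min(n, begin + chunk);
  for (int i = begin; i < end; ++i) c[i] = a[i] + b[i];
}

int main() {
  const int n = 1000, num_task = 4;
  std::vector<float> a(n, 1.0f), b(n, 2.0f), c(n, 0.0f);
  std::vector<std::thread> workers;
  // One thread per task here; TVM's runtime instead reuses pooled workers.
  for (int t = 0; t < num_task; ++t)
    workers.emplace_back(AddTask, a.data(), b.data(), c.data(), n, t, num_task);
  for (auto& w : workers) w.join();
  std::printf("c[0] = %.1f\n", c[0]);  // expect 3.0
  return 0;
}

A real runtime would keep a pool of pinned worker threads and synchronize with atomics (as the kSyncStride snippet above hints), rather than spawning fresh threads on every operator call.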