Hi,
Is there any way to perform layer-wise profiling of a neural network constructed using NNVM Symbols? I have used module.time_evaluator(), but it only gives the total runtime of the module.
Thanks.
For CUDA, nvprof works well.
For x86, you can use VTune if you have access to it. I have had a good experience using VTune with NNVM.
Or you can modify the runtime's Run() function here into something like this:
// Requires #include <chrono>.
void Run() {
  // setup the array and requirements.
  for (size_t i = 0; i < op_execs_.size(); ++i) {
    const auto& node = nodes_[i];
    const auto& func_name = node.param.func_name;
    if (op_execs_[i]) {
      // steady_clock is monotonic, so system clock adjustments cannot skew the result.
      auto t0 = std::chrono::steady_clock::now();
      op_execs_[i]();
      auto t1 = std::chrono::steady_clock::now();
      // Report microseconds: most single ops finish in well under 1 ms.
      auto elapsed_us = std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count();
      LOG(INFO) << "Execute " << func_name << " took " << elapsed_us << " usec.";
    }
  }
}
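A single timed pass can be noisy, so it may help to average over several runs. Below is a minimal standalone sketch of that idea, not the actual runtime code: NamedOp, OpTiming and ProfileOps are hypothetical names made up for illustration, and the op list stands in for the runtime's op_execs_ and nodes_.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for one graph node: a name plus a callable.
struct NamedOp {
  std::string name;
  std::function<void()> exec;  // may be empty, e.g. for input nodes
};

// Hypothetical result record: mean wall-clock time of one op.
struct OpTiming {
  std::string name;
  double mean_us;
};

// Runs the whole op list once as an untimed warm-up, then `repeat` timed
// passes, and returns the mean time per op in microseconds.
// Ops with an empty callable are skipped, as in the loop above.
std::vector<OpTiming> ProfileOps(const std::vector<NamedOp>& ops, int repeat) {
  using Clock = std::chrono::steady_clock;
  if (repeat <= 0) return {};

  std::vector<double> total_us(ops.size(), 0.0);
  for (int r = -1; r < repeat; ++r) {  // r == -1 is the warm-up pass
    for (std::size_t i = 0; i < ops.size(); ++i) {
      if (!ops[i].exec) continue;
      auto t0 = Clock::now();
      ops[i].exec();
      auto t1 = Clock::now();
      if (r >= 0) {
        total_us[i] +=
            std::chrono::duration<double, std::micro>(t1 - t0).count();
      }
    }
  }

  std::vector<OpTiming> result;
  for (std::size_t i = 0; i < ops.size(); ++i) {
    if (ops[i].exec) result.push_back({ops[i].name, total_us[i] / repeat});
  }
  return result;
}
```

Calling ProfileOps(ops, 10) would return one entry per executable op, which you can then log or sort by mean_us. Note that host-side timing like this only makes sense when the ops run synchronously. On CUDA, kernel launches are asynchronous, so you would need to synchronize the device after each op, or just use nvprof as suggested above.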
@masahi Thanks. I will try both!