creddy
March 30, 2018, 11:32am
I observed that the time reported by time_evaluator does not match the nvprof kernel timings. Is there a way to measure the actual kernel execution time instead of wall-clock time?
I tested some kernels, and the time_evaluator and nvprof timings are very close (error < 2%).
Could you try time_evaluator with a large number (e.g. 1000)?
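To illustrate why a large number may help: time_evaluator runs the function `number` times and reports the average, so one-off costs such as warm-up are spread over many calls. Below is a minimal pure-Python sketch of that averaging idea. It is not TVM's implementation; `time_evaluator_sketch` and `fake_kernel` are hypothetical names used only for illustration, and the workload is a CPU stand-in rather than a real GPU kernel.

```python
import time


def time_evaluator_sketch(func, number, repeat=1):
    """Hypothetical helper: return mean seconds per call, one value per repeat."""
    results = []
    for _ in range(repeat):
        start = time.perf_counter()
        # Run the function back to back so fixed timing overhead is amortized.
        for _ in range(number):
            func()
        elapsed = time.perf_counter() - start
        results.append(elapsed / number)
    return results


def fake_kernel():
    # Stand-in workload (assumption); a real test would call the compiled kernel.
    return sum(i * i for i in range(1000))


# Compare a single run against an average over many runs.
for number in (1, 1000):
    per_call = time_evaluator_sketch(fake_kernel, number=number, repeat=3)
    print(number, ["%.3e" % t for t in per_call])
```

With `number=1` the three values usually vary more from repeat to repeat; with `number=1000` they tend to settle, which is the behavior to look for before comparing against nvprof.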
            with open(path_cc, "w") as f:
                f.write(_PackImportsToC(self, is_system_lib))
            files.append(path_cc)
        if not fcompile:
            if file_name.endswith(".tar"):
                fcompile = _tar.tar
            else:
                fcompile = _cc.create_shared
        fcompile(file_name, files, **kwargs)

    def time_evaluator(self, func_name, ctx, number, repeat=1):
        """Get an evaluator that measures time cost of running function.

        Parameters
        ----------
        func_name: str
            The name of the function in the module.

        ctx: TVMContext
            The context we should run this function on.